This paper describes three new approaches to planning for digital
loop carrier (DLC) in the exchange feeder loop network. Two of these 
are manual planning methods that are used to determine the most 
economical technology for satisfying facility shortages along a feeder 
route. Both are based on the concept of a distance-oriented crossover 
point beyond which DLC is the economic relief technology. One of the
methods uses a global crossover point for all routes, and the other 
uses a route-specific crossover point. The third approach is a mech- 
anized tool, the Pair Gain Planning (PGP) program. PGP, which is 
implemented as a module within the existing Bell System loop plan- 
ning system, first identifies the appropriate technology for relief and 
then synthesizes a specific DLC implementation plan. A discussion of 
system performance and projected Bell System applications of the 
various methods is included. 


I. INTRODUCTION


For most of the twentieth century the subscriber loop network has 
been dominated by cable technology. Although carrier systems have 
been applied successfully in the rural environment¹ for a number of
years, it has only been in the very recent past that carrier has gained 
a foothold in the high-growth suburban environment. However, this 
situation is changing rapidly, and it is predicted that by the mid-to- 
late 1980s 50 percent of all loop growth will be served by digital loop 
carrier (DLC).²




To plan for this growth, a detailed study procedure, the Suburban 
Pair Gain Planning (SPGP) method, was developed.³ SPGP was designed
for general DLC applications where existing Bell System tools, which
were designed for rural applications,⁴ were not appropriate. The SPGP
procedure uses six tabular forms and the existing Bell System planning
tool for conventional cable and structure relief, the Exchange Feeder
Route Analysis Program (EFRAP),⁵ to determine a DLC relief plan for
a feeder route.

SPGP performs the two planning steps necessary for developing a
relief plan. First, it determines the appropriate technology for satisfy- 
ing facility shortages along a feeder route (i.e., cable or DLC). Second, 
it selects from the possible alternatives an economic implementation 
plan for the selected technology. Although not optimal, the method 
has been shown to produce good relief plans on a wide variety of Bell 
System feeder routes. 

Since SPGP can be very time consuming to perform, three alternative 
planning systems have recently been developed for Bell System plan- 
ning applications. Two of these are manual methods that address the 
technology decision question and provide at least limited evaluation of 
alternative implementation plans. The third is a mechanized system 
that performs both the technological decision-making and the imple- 
mentation plan evaluation steps. 

In this paper we describe both the manual and mechanized systems 
and discuss their applicability and performance. First, we describe the 
manual methods. Both are based on the concept of a distance-oriented 
crossover point, a point on a route beyond which DLC is the economic 
technology for relief. Second, we describe the mechanized system, the 
Pair Gain Planning (PGP) program. PGP has been developed as a 
module that is incorporated within the existing EFRAP system. We 
conclude the paper with a discussion of experience using the various 
methods and their projected Bell System applications. 


II. A TECHNOLOGY DECISION AID—THE CROSSOVER POINT
2.1 Introduction 


A simple decision aid is needed to provide telephone company 
managers with a rapid means of evaluating proposed feeder projects 
and to provide planning engineers with a simple scale by which they 
can measure the economic feasibility of any proposal. In the interoffice 
trunk-planning environment, this need has traditionally been filled by 
using a length-oriented crossover point. A similar approach for the 
feeder plant has been developed. 

A carrier crossover point is defined as the point on a route beyond 
which it is generally more economical to relieve shortages with DLC 


2130 THE BELL SYSTEM TECHNICAL JOURNAL, NOVEMBER 1982 


than with cable. In contrast to the trunk network, which is a point-to- 
point network with uniform shortages and growths, the feeder plant is 
characterized as being branchy, with multiple gauges and demand 
points, tapering facilities, and shortages staggered over time. For this 
reason, it has been historically difficult to develop a crossover point 
for the feeder with any reliability. A new approach was needed. 

The following sections describe two new approaches to the devel- 
opment of a crossover point. The first addresses the more traditional 
problem of defining a global point applicable for any feeder route in 
the Bell System. The second develops a crossover point unique to each 
feeder route, which is generally less than or equal to the global 
crossover point. The global crossover has the advantage of being 
simple to apply in comparison with a route-specific point. However, a 
global crossover cannot be applied to individual routes since it only 
applies in the aggregate. Hence, both points are needed, one to perform 
top-down DLC studies, the other for bottom-up studies. 


2.2 Global crossover model 


A global crossover point can be developed from two approaches: (i)
from sample statistics of feeder route economics, and (ii) from a model
representation of a feeder route. The approach described here uses a 
model representation because we felt a model is more amenable to 
parametric “what-if” studies. The accuracy of the model can be 
verified by comparing results with a sample of actual routes. Devel- 
opment of such a model, its results, and verification are reported 
below. 

The Global Crossover Point model presumes an existing two-gauge 
cable loop from a central office (CO) to a variable end point. Forecasted 
growth is collected at the end of the loop and at the gauge change 
point. Relief requirements under both a cable-only and a DLC plan are 
then determined to meet this growth rate and associated present worth 
of expenditures (PWE) costs are computed for each plan. 

The model provides the means of evaluating the changes in crossover 
points by determining the PWE differences between an all-cable and a 
DLC relief solution. The model predicts the impact on the crossover 
point of relief growth, loop length, and the shortage sequence. 

A significant feature of this model is the presence of fine-gauge cable 
requirements in the DLC plan. This recognizes that any growth between
the CO and the crossover point must be served by cable, even in the
DLC plan. However, the timing of this “residual cable” is determined 
by the sequence with which shortages occur along the length of the 
route. Two different relief-shortage sequences were used in the model, 
namely: (i) CO to field and (ii) field to CO. For simplicity, no conduit
or other structure costs are assumed in the model. Economic sizing of 
cables for each growth rate⁶ and the most current estimates of costs,
inflation rates, and the impact of special services are used. 

To determine the crossover point, a PWE is computed for each loop 
length, growth rate, and relief sequence studied for both a cable and a 
DLC relief plan. These PWEs can then be plotted against growth for 
each loop length. The crossover point for any growth rate is the point 
where the PWEs for the cable and DLC relief plans are equal. The locus
of such points across different length routes provides the desired 
function of crossover length versus growth rate. 
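The crossover search described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `crossover_length`, the sample loop lengths, the growth-rate keys, and every PWE figure are invented for the example and do not come from the model runs reported here. For one growth rate, the sketch interpolates linearly between sampled loop lengths to find where the cable and DLC PWEs are equal; repeating this per growth rate gives the locus.

```python
def crossover_length(lengths_kft, pwe_cable, pwe_dlc):
    """Return the loop length at which the cable PWE first meets the DLC PWE.

    Linear interpolation between adjacent sample lengths is an assumption of
    this sketch. Returns None if cable stays cheaper over the sampled range.
    """
    diffs = [c - d for c, d in zip(pwe_cable, pwe_dlc)]
    for k in range(len(diffs) - 1):
        d0, d1 = diffs[k], diffs[k + 1]
        # Cable is cheaper (or equal) at length k and DLC is cheaper at k + 1.
        if d0 <= 0 <= d1 and d0 != d1:
            frac = -d0 / (d1 - d0)
            return lengths_kft[k] + frac * (lengths_kft[k + 1] - lengths_kft[k])
    return None


# Hypothetical PWE samples (arbitrary cost units) for two growth rates.
lengths = [10, 20, 30, 40]
curves = {
    1.0: ([100, 200, 300, 400], [220, 240, 260, 280]),
    2.0: ([150, 300, 450, 600], [250, 300, 350, 400]),
}

# Locus of crossover length versus growth rate.
locus = {
    growth: crossover_length(lengths, cable, dlc)
    for growth, (cable, dlc) in curves.items()
}
print(locus)
```

Shifting either PWE list by a constant before calling the function mimics the graphical shifting of the PWE curves used to test different cost assumptions.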

One advantage of this model is the ease of evaluating the impact of 
different cost assumptions. By graphically shifting the PWE curves, the 
impact on the crossover point is quickly determined. For example, a 
net circuit cost advantage was assumed for provisioning special services 
on DLC rather than cable, and the crossover points were recalculated. 
The study also looked at the effect on the crossover point of the 
intangible advantages of digital technology. Such advantages might 
include lower overall maintenance costs and possibly additional reve- 
nues from new services. Results of this study for a typical model and 
for the two relief sequences mentioned above are shown in Table I. 

These results show that adding intangible DLC effects and special
service advantages can move a crossover point distance by about 10 
percent. The relief sequence assumption has a much larger impact. 

A number of routes previously selected as candidates for DLC application
were used to verify the model. The PGP program (see Section
III) was then run on each route. No PWE penalty or DLC advantage for 
special services was assumed. All conduit requirements were assumed 
to be satisfied. The distance from the CO to the closest PGP target
section was determined. (A target section is a feeder section that is 
more economical to relieve with DLC than with cable.) Since we are 
interested in near-term crossovers, only targets that required DLC relief 
in the first five years were selected. 

The verification confirmed the Sequence 1 (34 kft) worst-case cross- 
over. Of all routes studied (and several major branches on some 
routes), none had the closest PGP target section beyond 34 kft. Target
sections ranged, however, from 8 kft to 34 kft. This suggests that a 
global crossover can only be used as an upper bound, as in Sequence 
1 above. Therefore, Sequence 2 results are not useful as a global 
crossover. 


Table I—Typical model results

                     Nominal           With 10% DLC   With Special
                     Crossover Point   Advantage      Service Advantage
Relief Sequence 1    34 kft            32 kft         31 kft
Relief Sequence 2    28 kft            26 kft         25 kft

2.3 Prescription design approach for route planning


Prescription design (PD) is a manual planning tool that uses a PWE 
per line criterion to select one or more route-dependent crossover 
points. It requires a cable and structure relief plan for the route and a 
schematic layout showing the PWE for cable and structure (P_i) and the
average cumulative growth (g_i) in each feeder section over the study
period, as shown in Fig. 1. 

PD is intended for a non-EFRAP environment. Therefore, the cable 
and structure plan for the route is obtained using manual cable and 
structure sizing curves and economic study tools. These data can be 
used to develop a route economic profile (the piecewise linear curve in 
Fig. 2) that indicates the approximate cost per growth line to provide 
cable and structure relief from any point on the route to the CO.

To obtain a similar economic measure for DLC, a separate calculation 
is performed. The PWE for providing DLC relief during the same study
period for various growth rates is represented by the continuous cost 
curve in Fig. 2. 

The area above the DLC curve depicts where DLC is more economical,
and the area below the DLC curve depicts where cable is more economical.
The portions of the route profile that fall above the DLC curve identify
the portions of the feeder route that should be relieved by DLC.
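The comparison of the route profile with the DLC curve can be sketched as follows. Several things here are assumptions made only for illustration: the profile is taken to be the running sum of P_i / g_i from the CO outward (one plausible reading of "cost per growth line from any point on the route to the CO"), the DLC cost curve is a made-up function of growth, and all section figures are invented.

```python
def route_profile(sections):
    """Cumulative cable-and-structure cost per growth line, CO outward.

    `sections` is a list of (P_i, g_i) pairs ordered from the CO to the field.
    Summing P_i / g_i along the path is an assumption of this sketch.
    """
    profile = []
    running = 0.0
    for pwe, growth in sections:
        running += pwe / growth
        profile.append(running)
    return profile


def dlc_relief_sections(sections, dlc_cost_per_line):
    """Return 1-based section numbers whose profile lies above the DLC curve."""
    profile = route_profile(sections)
    return [
        number
        for number, (cost, (_, growth)) in enumerate(zip(profile, sections), start=1)
        if cost > dlc_cost_per_line(growth)
    ]


# Hypothetical four-section route: (section PWE, cumulative growth in lines).
sections = [(40000, 400), (45000, 300), (50000, 200), (30000, 100)]


def hypothetical_dlc_curve(growth):
    """Invented DLC PWE per line: fixed per-line cost plus a shared site cost."""
    return 400 + 2000 / growth


print(route_profile(sections))
print(dlc_relief_sections(sections, hypothetical_dlc_curve))
```

With these invented figures the profile rises above the DLC curve only in the two sections farthest from the CO, which is the pattern the prescription design chart is meant to expose.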

In the example, DLC should be used for relief in route sections 3 and
4. Once the sections for DLC relief are identified, potential remote
terminal (RT) sites are located at the field ends and are then timed and 
sized. 

In prescription design, timing and sizing algorithms are provided for 
two DLC deployment strategies, namely: (i) “growth only,” where only
growth lines are committed to the RT; and (ii) “growth plus existing,”
where growth lines plus some existing lines are cut over to the RT to
make pairs available to relieve shortages closer to the CO. The timing
and sizing procedure is straightforward with the growth-only strategy, 
but the algorithm is more complex if a growth plus existing strategy is 
employed. 






Fig. 1—Route schematic model. (Figure labels: serving area growth rate; cumulative growth rate; P_i, section PWE for cable and structure.)

Fig. 2—Prescription design.


III. THE PAIR GAIN PLANNING PROGRAM
3.1 Introduction 


The Pair Gain Planning (PGP) program is a new, mechanized plan- 
ning tool that integrates the technology decision and the selection of 
the best implementation plan into one mechanized process. After a 
base EFRAP run, which determines a near optimal all-cable relief plan 
for a study route, the PGP program uses a network flow model of all 
existing feeder capacity, proposed cable additions, and potential DLC 
implementations to find the best combination of cable and carrier to 
relieve the route, i.e., to make the technology decision. The application 
of network flow methodology to DLC planning provides near-optimal
solutions quickly and efficiently. 

After the network flow algorithm has identified sections for DLC 
relief and a list of potential carrier serving areas (CSAs)⁷ for activation,
extensions of SPGP algorithms are used to find the theoretical RT sites
that will be activated. The program then times the placement of DLC 
systems at those sites. 

As shown in Fig. 3, the PGP program is designed to be an interactive
tool. After the program has developed a carrier relief solution, the user
is given the option of modifying the solution to establish a relief plan 
that is most appropriate to the immediate local environment. When 
the user is satisfied with the solution, the PGP program creates a 
modified EFRAP input file that reflects the addition of DLC relief on the 
route so that EFRAP can be used to determine a new cable relief plan 
that satisfies any residual shortages that are not economic to relieve 
with DLC. 
Fig. 3—PGP terminal session.


3.2 Network flow algorithm for target-section selection 


The program’s first task is to decide which cable and structure 
placements, as determined by a base EFRAP study, can be relieved 
more economically with DLC. Generally, it is not economical to use DLC
to replace all cables on the route, and the program must locate the 
economical DLC placements. Since it is not possible to compare cable
and DLC alternatives one EFRAP section at a time, as EFRAP does in its
cable analysis, a global algorithm that examines the entire route at
once must be used.

This problem of finding cable relief projects that are more econom- 
ically relieved with DLC can be solved by a number of algorithms, but 
the one chosen for the PGP program considers the problem as a 
minimum-cost network flow problem. The advantages to this approach 
are that many efficient, easily programmable algorithms have been 
developed for solving this class of problem⁸ and that standard software
already existed for this purpose,⁹ thus shortening the PGP program
development time. 

The minimum-cost network flow problem has been applied to a wide 
range of situations, including transportation of goods, design of pipeline 
systems, and production scheduling.⁸,¹⁰,¹¹ The general form of this
problem is concerned with the flow of a commodity through a network, 
which is a directed graph defined by a set of nodes and a set of arcs 
connecting the nodes. The term “directed” implies that the commodity 
being studied can flow in only one direction, from a tail node to a head 
node. For each arc, there is a piecewise linear convex function that 
defines the cost per unit of flow over this arc as a function of its 
present flow. Upper and lower bounds of flow are also defined for each 
arc. Each node is identified as one of three types: (i) a supply node,
where flow enters the network; (ii) a demand node, where flow leaves; 
or (iii) a transshipment node. No storage is permitted at nodes. There 
is also an objective function that usually minimizes the total cost with 
flows that satisfy the upper and lower bounds on each arc and preserve
the conservation of flow at each node. 

In mathematical form, the network is stated by a node-arc incidence
matrix⁸ $A$ (an $I \times J$ matrix if the network has $I$ nodes and $J$ arcs),
with elements

\[
a_{ij} =
\begin{cases}
+1 & \text{if arc } j \text{ is directed out of node } i, \\
-1 & \text{if arc } j \text{ is directed into node } i, \\
\phantom{+}0 & \text{otherwise.}
\end{cases}
\]

The problem is then stated as:

\[
\begin{aligned}
\min\ & c\,x \\
\text{such that}\ & A x = r, \\
& l \le x \le u,
\end{aligned}
\]

where

  $x_j$ is the flow on arc $j$,
  $u_j$ is the upper bound on arc $j$,
  $l_j$ is the lower bound on arc $j$,
  $c_j$ is the cost for arc $j$ (need not be linear in the general case),
  $r_i$ is the supply ($>0$) or demand ($<0$) at node $i$.
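To make the notation concrete, the following sketch builds the node-arc incidence matrix for a tiny three-node network and checks a candidate flow against the conservation and bound constraints. The network, the flows, and the helper names `incidence_matrix` and `is_feasible` are all invented for illustration; the sign convention follows the definition above (+1 for an arc leaving a node, supply positive).

```python
def incidence_matrix(num_nodes, arcs):
    """Build A with a[i][j] = +1 if arc j leaves node i, -1 if it enters node i."""
    matrix = [[0] * len(arcs) for _ in range(num_nodes)]
    for j, (tail, head) in enumerate(arcs):
        matrix[tail][j] = 1
        matrix[head][j] = -1
    return matrix


def is_feasible(matrix, x, r, lower, upper):
    """Check conservation of flow (A x = r) and the bounds l <= x <= u."""
    conserved = all(
        sum(a * flow for a, flow in zip(row, x)) == supply
        for row, supply in zip(matrix, r)
    )
    bounded = all(lo <= flow <= hi for lo, flow, hi in zip(lower, x, upper))
    return conserved and bounded


# Hypothetical network: node 0 supplies 10 units; nodes 1 and 2 demand 4 and 6.
arcs = [(0, 1), (1, 2), (0, 2)]
A = incidence_matrix(3, arcs)
r = [10, -4, -6]
lower = [0, 0, 0]
upper = [8, 5, 5]
cost = [1, 1, 3]

x = [7, 3, 3]
total_cost = sum(c * flow for c, flow in zip(cost, x))
print(A, is_feasible(A, x, r, lower, upper), total_cost)
```

The check only verifies feasibility of a given flow; finding the minimum-cost flow requires a solver.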


Figure 4 shows a typical EFRAP feeder route layout, with appropriate 
dummy sections added for the DLC RT site at the field end of section 
1106. The network model for the same route is shown in Fig. 5, 
including additional arcs used to represent existing and future DLC 
capacity at potential RT sites. 

Fig. 4—Typical EFRAP feeder route layout. (Figure labels: EFRAP load areas 1102 and 1107.)

Fig. 5—Network model for typical EFRAP feeder route layout.

In the PGP program, the objective function is the PWE cost to provide
relief to the route. The commodity flowing in the network is the
demand for loop feeder capacity, both existing and future, which must
find its way from the supply node at the co to the nodes representing 
the various EFRAP load areas. The demands are the projected require- 
ments for facilities at the end of the study period (typically 20 years). 
To simplify implementation, this model assumes that all capacity 
additions and demands for the entire study period occur at the start of 
the study. The arcs of the model correspond to either (i) EFRAP 
sections with upper bounds representing existing facilities plus future 
cable additions from the base EFRAP study, or (ii) future DLC place- 
ments with RT sites at their head nodes. All lower bounds are zero. 
The cost function is zero for flows below the existing capacity on 
arcs representing cables or existing DLC in EFRAP sections. When this 
value is exceeded, a cost is incurred for the facilities that must be 
placed to support this flow. On cable arcs, this cost increases linearly 
using the per-pair cost of the cables placed by EFRAP, starting with the 
least expensive cable placed during the study period. This procedure 
yields a convex cost function. The cost function for future DLC arcs 
uses the DLC cost model described above to obtain a per-unit cost with 
the common equipment and site costs averaged over a full system. 
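The cable-arc cost function described in this paragraph can be sketched as below. The function name, the argument layout, and all pair counts and per-pair costs are hypothetical; the only properties taken from the text are that flow up to the existing capacity is free and that proposed cables are charged per pair, cheapest first, which makes the function convex.

```python
def cable_arc_cost(flow, existing_capacity, additions):
    """Cost of carrying `flow` pairs on a cable arc.

    `additions` lists the cables proposed by the base study as
    (pairs, per_pair_cost) tuples. Flow within the existing capacity is free;
    excess flow uses the proposed cables in order of increasing per-pair cost.
    """
    excess = max(0, flow - existing_capacity)
    cost = 0.0
    for pairs, per_pair_cost in sorted(additions, key=lambda cable: cable[1]):
        used = min(excess, pairs)
        cost += used * per_pair_cost
        excess -= used
        if excess == 0:
            break
    if excess > 0:
        raise ValueError("flow exceeds the arc's upper bound")
    return cost


# Hypothetical section: 600 existing pairs, two proposed cables.
additions = [(900, 30.0), (600, 20.0)]
for flow in (500, 1000, 1500):
    print(flow, cable_arc_cost(flow, 600, additions))
```

Because the marginal cost never decreases as flow grows, each cost segment can be represented in a network solver as a separate parallel arc with its own capacity and unit cost.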
When the solution to the network problem is obtained, those phys- 
ical arcs whose flow does not exceed their existing capacity are the 
target sections where DLC relief should be used since the network flow 
algorithm has found it more economical to route flow over DLC arcs 
rather than increase flow through these cable arcs (which is equivalent 
to placing new cable). Also, those future DLC arcs with positive flows 
represent potential RT sites, since they indicate places where it is more 
economical to route flow via DLC. The timing and sizing routines
discussed below will select the actual RT sites used in the DLC relief
plan from this group. 
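A small worked example may help show how target sections fall out of a minimum-cost flow solution. The solver below is a generic successive-shortest-path routine written for this illustration; it is not the standard software the PGP program used, and the three-node route, capacities, and costs are invented. Each convex cost segment is modeled as its own parallel arc.

```python
def min_cost_flow(num_nodes, segments, source, demands):
    """Successive shortest paths on (tail, head, capacity, unit_cost) segments.

    Returns the flow on each segment. A super sink collects all demands.
    """
    sink = num_nodes
    tail, head, cap, cost = [], [], [], []

    def add_edge(u, v, capacity, unit_cost):
        # Forward edge at an even index; its residual reverse edge follows it.
        tail.extend([u, v])
        head.extend([v, u])
        cap.extend([capacity, 0])
        cost.extend([unit_cost, -unit_cost])

    for u, v, capacity, unit_cost in segments:
        add_edge(u, v, capacity, unit_cost)
    for node, amount in demands.items():
        add_edge(node, sink, amount, 0)

    remaining = sum(demands.values())
    while remaining > 0:
        # Bellman-Ford shortest path in the residual graph.
        dist = [float("inf")] * (num_nodes + 1)
        pred = [None] * (num_nodes + 1)
        dist[source] = 0
        for _ in range(num_nodes):
            changed = False
            for e in range(len(cap)):
                if cap[e] > 0 and dist[tail[e]] + cost[e] < dist[head[e]]:
                    dist[head[e]] = dist[tail[e]] + cost[e]
                    pred[head[e]] = e
                    changed = True
            if not changed:
                break
        if pred[sink] is None:
            raise ValueError("demand cannot be met within the arc bounds")
        path, node = [], sink
        while node != source:
            path.append(pred[node])
            node = tail[pred[node]]
        push = min(remaining, min(cap[e] for e in path))
        for e in path:
            cap[e] -= push
            cap[e ^ 1] += push
        remaining -= push
    # Flow on a segment equals the capacity built up on its reverse edge.
    return [cap[2 * k + 1] for k in range(len(segments))]


# Hypothetical route: CO (0) -> node 1 -> node 2, plus a DLC arc CO -> node 2.
segments = [
    (0, 1, 500, 0),   # section A, existing pairs
    (0, 1, 400, 30),  # section A, cable proposed by the base study
    (1, 2, 200, 0),   # section B, existing pairs
    (1, 2, 300, 50),  # section B, cable proposed by the base study
    (0, 2, 384, 60),  # future DLC arc with an RT site at node 2
]
flows = min_cost_flow(3, segments, source=0, demands={1: 450, 2: 400})
print(flows)
```

With these invented numbers the solver leaves the proposed cable in section B unused and routes 200 lines over the DLC arc, so section B would be a target section and node 2 a potential RT site, while section A still needs some new cable.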


3.3 Timing and sizing algorithms for RT locations 


The procedure of timing and sizing of CSA theoretical RT locations is
divided into four parts: (i) the selection of RT sites to activate; (ii) the
determination of the amount of DLC, by time, to be placed at the
location; (iii) the calculation of the PWE for the electronics placed; and
(iv) the production of EFRAP data for the residual cable-relief recommendations.

The following algorithm determines the order for activating RT sites. 
The goal is to eliminate the earliest shortages on the route first. The 
route is examined for target sections using the EFRAP path model for 
the route as a tree. The algorithm searches from the co to the ends of 
the route for the branch of the tree that has the target sections with 
the earliest shortage dates. The RT site that will be activated to relieve
this branch is the potential RT site that is on the field side of the most 
recently selected target section and closest to that target section. 
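One way to picture this selection rule is the simplified sketch below. It assumes the route is stored as a child-to-parent map pointing toward the CO, that each potential RT site is identified by the section at whose field end it sits, and that "closest" means fewest sections away; these representations, the section numbers, and the shortage years are all hypothetical, and the sketch handles only a single selection step.

```python
def hops_to(section, ancestor, parent):
    """Sections walked from `section` toward the CO to reach `ancestor`.

    Returns None if `ancestor` is not on the path to the CO, i.e. if `section`
    is not on the field side of `ancestor`.
    """
    hops = 0
    while section is not None:
        if section == ancestor:
            return hops
        section = parent.get(section)
        hops += 1
    return None


def next_rt_site(parent, shortage_year, potential_sites):
    """Pick the earliest-shortage target section and the nearest field-side site."""
    target = min(shortage_year, key=shortage_year.get)
    candidates = []
    for site in potential_sites:
        hops = hops_to(site, target, parent)
        if hops is not None:
            candidates.append((hops, site))
    if not candidates:
        return target, None
    return target, min(candidates)[1]


# Hypothetical route tree: each section maps to the next section toward the CO.
parent = {1101: None, 1102: 1101, 1103: 1102, 1104: 1103, 1105: 1102, 1106: 1105}
shortage_year = {1103: 1984, 1105: 1986}   # target sections and shortage dates
potential_sites = [1104, 1106]

print(next_rt_site(parent, shortage_year, potential_sites))
```

In this example section 1103 exhausts first, and the site at section 1104 is the only candidate on its field side, so it would be activated first.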

The next step is to determine the amount of DLC needed to be placed 
at this location. The upper bound on the amount of DLC that is useful 
to place at that location is determined by the number of assigned pairs 
and growth pairs in the CSA associated with the RT site and with the 
cutover strategy that will be used for this RT site. The PGP program
allows either a cutover strategy that places all of the pairs in the
CSA on DLC or one that places just enough pairs to assure that no cable
will be needed in the target sections affected by this RT site. To be
realistic, this upper bound is decremented to allow for DLC trunk pairs.
For each year, the PGP program attempts to cover the shortages in the
target sections on this branch with the available DLC.
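The year-by-year sizing step might look roughly like the following. This is a loose sketch, not the PGP timing and sizing algorithm: it assumes cumulative shortages per year, a single RT site, whole-system placement increments, and a simple subtraction for trunk pairs, and every number and name in it is hypothetical.

```python
import math


def time_dlc_systems(shortage_by_year, eligible_lines, trunk_pairs, lines_per_system):
    """Schedule DLC system placements at one RT site.

    `shortage_by_year` gives the cumulative pair shortage to be covered in each
    year. The upper bound on useful DLC is the eligible lines in the CSA less
    an allowance for DLC trunk pairs. Returns {year: systems added that year}.
    """
    usable = max(0, eligible_lines - trunk_pairs)
    placed = 0
    schedule = {}
    for year in sorted(shortage_by_year):
        covered = min(shortage_by_year[year], usable)
        needed = math.ceil(covered / lines_per_system)
        if needed > placed:
            schedule[year] = needed - placed
            placed = needed
    return schedule


# Hypothetical CSA: 300 eligible lines, 20 trunk pairs, 96-line systems.
schedule = time_dlc_systems(
    shortage_by_year={1983: 50, 1985: 150, 1988: 400},
    eligible_lines=300,
    trunk_pairs=20,
    lines_per_system=96,
)
print(schedule)
```

Any shortage above the usable bound (120 pairs in the last year of this example) is left for another RT site or for residual cable.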

When enough theoretical RT sites are activated to relieve all target 
sections, DLC placements at the RT sites are used to revise the original 
EFRAP input data to reflect the DLC systems placed. These data are 
processed, and the EFRAP program determines the PWE for the residual
cable needed for the relief solution. A comparison of the all-cable and
the cable-and-DLC plan PWEs then indicates the DLC savings.


3.4 Interactive features 


The PGP program is an interactive system. The interactive alterna- 
tives are designed to assist the outside plant engineer in forming a DLC 
relief plan. The user has the choice of calling for an “automatic” 
solution or of hand-tailoring a solution. Several options are available 
to modify a solution or to demand a solution. These options create a 
broad spectrum of possible user control over the program solution. 
The PGP program interactive options are invoked within a framework
that allows the user to save the status of the work and return to that
point at a later date. 


3.4.1 Automatic run 


The PGP program has the capability of producing a DLC relief plan 
with very little input from the user at the terminal. The user need only 
specify the run number of the EFRAP data that will be studied for DLC 
applications and the name of the DLC system that will be studied. 
From this input, the PGP program will execute an entire DLC study, 
producing a summary report that includes: 

• RT sites activated
• Number of DLC systems placed
• Schedule for DLC placements
• PWE for DLC placements.

The automatic run of the PGP program produces excellent DLC relief 
plans and is the recommended starting point for developing a final 
project. 


3.4.2 Solution modification 


In addition to the brief output reports given at the terminal with the 
automatic run, the user can request detailed reports for every RT site 
that is activated. After studying the output reports for a solution, the 
outside plant engineer might want to more accurately reflect in the 
study some conditions of the route. This is accomplished by choosing 
any of the modifying options: 

• Change the DLC cutover strategy
• Change the list of target sections being considered for DLC relief
• Change the list of potential RT sites
• Change the list of forbidden RT sites
• Change the DLC data associated with a particular RT site.


3.4.3 Demand solutions 


To compare the costs of making working-pair transfers to the costs 
of activating additional CSAs, the PGP program demand option is
available. During the terminal session, the user associates a first year 
of activation and a DLC cutover strategy with each demanded RT site. 
All demanded RT sites are activated by the PGP program in the years 
specified by the user before any other RT sites that may be necessary. 
For demanded sites, the timing and sizing algorithm will select either 
a growth-only cutover strategy or one that cuts all assigned-plus- 
growth pairs in the CSA onto DLC. 


3.5 Results 


A side-by-side comparison of the PGP program network flow solution 
with the PD method was conducted using EFRAP route data for ten
Bell System routes. The PGP program DLC plus residual cable and
structure solutions were generally lower in cost than the equivalent
PD solutions; on average the network flow solution showed a
7-percent lower PWE. (PWE figures do not include rearrangement costs,
which were not computed for either study.) 

The PGP program is valuable in that the network flow algorithm 
provides very good starting solutions, and that the interactive options 
can be used to easily modify the starting solution to satisfy local 
requirements. During the PGP program field trial, which was conducted 
in three Bell System operating companies, users evaluated as many as 
ten alternative relief plans for a single route. Because the PGP program 
is a mechanized tool, they were able to do this within two or three 
days, whereas equivalent SPGP studies would have taken over a month
of engineering time. 


IV. STUDY PROCEDURE SUMMARY 


Since the PGP program performs both phases of feeder planning in
an integrated fashion with little manual effort, the Bell System has 
recommended its use as the primary feeder-planning vehicle for DLC. 
However, global crossover points and PD both have a place in the 
planner’s arsenal. 

Global crossover points are still used in project review to rapidly 
assess the planner’s technology decision. As such, they effectively shift 
the burden of proof to cable on longer routes. The global crossover can 
also be used to guide the engineer in the interactive portion of PGP 
studies. Since many implementations of DLC with similar cost are 
possible on a route, the crossover can guide the planner to the solution 
that best meets company objectives, while PGP can measure the 
economic implications. 

For those engineering districts that have not yet implemented the 
prerequisite EFRAP, PD provides an effective interim procedure for 
planning digital carrier. (Although circumstances are changing quickly, 
a private 1981 survey showed that about one third of all Bell System 
engineering districts fell into this category.) The global crossover can 
be used in combination with PD to split the route into two parts. Only 
sections between the crossover and the CO need to be evaluated with
PD. 


V. CONCLUSION 


This paper described three tools for studying feeder relief in the 
digital age. Each tool has advantages and disadvantages, and each has its
place. There can no longer be a valid excuse for “business as usual”
and the placement of metallic cable exclusively. We expect these tools 
to provide significant stimulus to the advancement of the digital age. 
This paper describes algorithms for predicting end-to-end performance
measurements in private networks. Service characteristics such
as end-to-end blocking and delay are calculated based on point-to- 
point traffic data and a network routing guide. These techniques 
have been incorporated in the Enhanced Network Administration 
System. The system is routinely used by AT&T Long Lines and 
operating company network administrators for private network de- 
sign and evaluation. 


I. INTRODUCTION


An important procedure in network design is performance predic- 
tion. In a private network environment, a traffic engineer must rec- 
ommend a network design that satisfies the customer’s performance 
requirements. A set of computer programs called the Enhanced Net- 
work Administration System (ENADS) has been designed for adminis- 
tering Enhanced Private Switched Communications Service (EPSCS) 
networks and Electronic Tandem Switching (ETS) networks. The
ENADS Network Service Evaluator (NETEVAL) uses point-to-point 
traffic data and a network routing guide to provide detailed ser- 
vice characteristics of a network design. This paper summarizes the 
NETEVAL algorithms. 

In addition to characterizing network service, NETEVAL is used, in 
conjunction with other ENADS modules,¹ to ensure that the service
characteristics chosen by the customer are achieved. The Network 
Synthesis (NETSYN) module designs a network that is close to the 
service specified by the customer. The service evaluator is used to 
ensure that the final network recommendation meets the customer’s 
service requirements. 

Traditionally, performance estimates of a large network in an alternate
routing environment have been confined to a segment of a
network, such as the intermachine trunk portion.² NETEVAL contains
decomposition and aggregation procedures that permit evaluation of
complete end-to-end performance of a network as opposed to just a
network segment. In addition, the ENADS evaluator generalizes the
Katz algorithm to include private network features such as queues
and controlled access to network facilities via Facility Restriction
Levels (FRLs). This paper describes the decomposition and aggregation
procedures used in the evaluator, as well as the additional tools
required to analyze some of the network features germane to private
networks.

Fig. 1—Private network configuration.


II. PRIVATE NETWORK CHARACTERISTICS


A private network (see Fig. 1) interconnects customer locations (on- 
net service points) and other locations (off-net service points). An on- 
net service point is generally a PBX/Centrex or a key set connected 
directly to an access line group. Off-net points are served by off-net 
facilities (Foreign Exchange, WATS, or local off-net access lines). In 
this paper both access lines and off-net facilities are called end-links. 
One end of each end-link is associated with a service point and the 
other end is connected (or homed) to a switch. A network of interma- 
chine trunk groups connects the switches together, forming the inter- 
machine trunk-group portion of the network. Switches permit concen- 
tration of point-to-point traffic. 

Traffic generally originates at a service point, seizes an access line, 
and arrives at a switch. (It is possible in an ETS network for a service
point to reside at the same location as the switch, so that an access
line is not required.) NETEVAL assumes that the switch examines only 
the final destination of the traffic and its FRL and then associates an 
ordered route list of links at that switch. The route list is scanned until 
an idle circuit is found. If no such circuit exists, the call is blocked (in 
the absence of queuing). If a circuit is seized, the traffic arrives at the 
next switch, and the routing procedure continues until the call is either 
completed or blocked. 
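The routing behavior assumed here can be illustrated with a short sketch. It is a toy model, not NETEVAL: the route-list keying by (switch, destination, FRL), the link names, and the circuit counts are all invented, queuing is ignored, route lists are assumed to be loop-free, and the sketch releases any circuits already seized when a call is blocked.

```python
def route_call(switch, destination, frl, route_lists, idle):
    """Route one call; return the list of links seized, or None if blocked.

    `route_lists[(switch, destination, frl)]` is an ordered list of
    (link, next_switch) choices; next_switch is None when the link is the
    end-link to the destination. `idle[link]` counts idle circuits.
    """
    path = []
    while switch is not None:
        for link, next_switch in route_lists[(switch, destination, frl)]:
            if idle.get(link, 0) > 0:
                idle[link] -= 1          # seize the first idle circuit found
                path.append(link)
                switch = next_switch
                break
        else:
            # No idle circuit on any listed link: the call is blocked.
            for link in path:
                idle[link] += 1          # release circuits seized so far
            return None
    return path


# Hypothetical three-switch network; destination "X" is homed on switch "B".
route_lists = {
    ("A", "X", 0): [("A-B", "B"), ("A-C", "C")],
    ("C", "X", 0): [("C-B", "B")],
    ("B", "X", 0): [("B-X", None)],
}
idle = {"A-B": 0, "A-C": 2, "C-B": 1, "B-X": 1}

first = route_call("A", "X", 0, route_lists, idle)
second = route_call("A", "X", 0, route_lists, idle)
print(first, second)
```

The first call overflows from the busy direct link onto the alternate route through switch C; the second call seizes a circuit on A-C but is blocked at C.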

Other private network characteristics of interest for evaluation are 
queuing and FRLs. Queuing permits a call to wait at a specified trunk
group for an idle circuit. The call is queued on a trunk group after a 
search through an ordered route list fails to find an idle circuit. The 
queued call may still be sent to reorder if the caller’s wait exceeds a 
time-out threshold. Also, the caller may abandon the queue. The 
queue discipline is first come, first served. 

FRLs provide the capability to restrict or expand access to network-facility
route lists. The FRL can be based on the calling station and/or
an authorization code. This provides the customer the opportunity to 
offer different grades of service by restrictive routing to different 
groups of users in the network. The FRL and final destination of a call 
are used in deciding to which links in the network the call will have 
access. 


III. EARLIER SERVICE EVALUATION ALGORITHMS


Statistical techniques are available for estimating service character- 
istics of networks. It is well known that if first-offered traffic is 
generated from a Poisson process and holding times are exponentially 
distributed, then network characteristics can be obtained by formulat- 
ing an appropriate Markov chain model and solving the resultant 
system of birth-death equations. This technique, however, requires an 
exorbitant amount of storage and computer time and is not considered 
feasible for large-scale networks. 

Another reasonably accurate statistical technique that requires 
knowledge of only the traffic mean and variance (or the variance-to- 
mean ratio, called peakedness) is the Katz algorithm,’ which was 
originally designed for evaluating switch-to-switch blocking probabili- 
ties. This algorithm requires estimates of switch-to-switch traffic 
means and variances that are “effectively” offered to each trunk group 
in the network. 

Traffic that is carried on a link may be blocked on subsequent links. 
The holding time for such traffic on seized circuits is substantially less 
than that for completed traffic on the network. Effective-offered traffic 
reflects the shorter holding times of subsequently blocked traffic. The 
effective-offered load is used to compute the link-blocking probabilities
and traffic-overflow variance for each link. The algorithm is an itera-
tive process that updates effective-offered loads and link parameters
at the end of each iteration. 

After these initial link parameters have been computed, the switch- 
to-switch loads are distributed through the network based on the 
network-alternate routing plan and the link parameters. As this traffic 
is routed through the network, the effective means and variances of 
the traffic offered to each link are accumulated to update the link 
parameters for the subsequent iteration. The blocked traffic is also 
accumulated to provide probabilities for that iteration. After all the 
switch-to-switch loads have been distributed throughout the network, 
the link parameters are updated and the entire procedure is repeated 
until convergence in the switch-to-switch loss probabilities is obtained. 

The Katz algorithm provides switch-to-switch blocking probabilities. 
Unfortunately, the number of calculations per iteration during the load 
assignment process is a function of the number of switches in the 
network. Traffic associated with each switch pair must be offered 
separately to each link in its routing path. Thus, an N-switch network 
must distribute N(N - 1) switch-to-switch loads through the inter-
machine trunk network. Since private networks generally consist of
several hundred service points, it is not computationally feasible to 
define each service point as a switch and use the Katz algorithm. In 
addition, the Katz algorithm does not analyze queuing. 


IV. NETEVAL ALGORITHM 


NETEVAL is an iterative algorithm that successively updates traffic 
loads and associated traffic characteristics until a convergence criterion 
is satisfied. Because of the size of the network, NETEVAL decomposes
the network into an end-link portion and an extended-trunk portion
(see Section 4.3) to compute the traffic characteristics. After conver-
gence the network performance components are aggregated to obtain
end-to-end network performance estimates. 

Sections 4.1 and 4.2 contain the NETEVAL assumptions and a brief 
outline of the algorithm. Subsequent sections explain in more detail 
the decomposition procedure, calculation of network parameters, and 
the aggregation techniques. 


4.1 Model assumptions 


The model assumptions for NETEVAL are as follows:
(i) Traffic means and variances provide a sufficient description of
the loads.
(ii) The holding time on a link has four components: actual message
time, ringing time, link set-up times, and subsequent queue delays.
Under this assumption blocked traffic can contribute positive loads to
the network.
(iii) A blocked customer will redial with a specified retrial proba-
bility and a variance that is a fraction of the overflow variance.
(iv) No queue time-outs or abandonments occur.
(v) The system is in statistical equilibrium.
(vi) Only the final destination and FRL of a call are used at a switch
to determine routing through the network.


4.2 NETEVAL algorithm for service evaluation 


The basic algorithm procedure is given below. Some of the terms
used are explained more clearly in later subsections.
(i) Decompose the network based on routing into an end-link
network portion and an extended trunk portion.
(ii) Associate traffic parcels with each network portion.
(iii) Initialize all parcel blockings. (Zero can be used if no other
estimates are available.)
(iv) Compute effective-offered parcels to the end-link network
portion.
(v) Calculate blocking probabilities for the end-link parcels.
(vi) Compute total switch-to-switch and switch-to-final-destina-
tion parcels offered to the extended trunk portion of the network.
(vii) Calculate switch-to-switch and switch-to-final-destination
blocking probabilities.
(viii) If end-link, switch-to-switch, and switch-to-final-destination
blocking probabilities change significantly, return to step iv. Otherwise,
go to step ix.
(ix) Compute point-to-point characteristics.
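The control flow of the steps above amounts to a fixed-point iteration on the parcel blockings. The sketch below shows only that outer loop, in Python. The `update` callback is a hypothetical stand-in for steps iv through vii, and the dictionary keyed by parcel name, the tolerance, and the iteration cap are all assumptions made for illustration.

```python
from typing import Callable, Dict

Blockings = Dict[str, float]


def iterate_to_convergence(
    update: Callable[[Blockings], Blockings],
    initial: Blockings,
    tol: float = 1e-6,
    max_iter: int = 1000,
) -> Blockings:
    """Repeat the update until every blocking changes by at most `tol`.

    `update` maps the current parcel blockings to new ones (steps iv to vii);
    the test on the largest absolute change plays the role of step viii.
    """
    current = dict(initial)
    for _ in range(max_iter):
        updated = update(current)
        largest_change = max(
            (abs(updated[name] - current[name]) for name in updated), default=0.0
        )
        if largest_change <= tol:
            return updated
        current = updated
    raise RuntimeError("blocking probabilities did not converge")


# Toy update with a known fixed point of 0.2, used only to exercise the loop.
def toy_update(blockings: Blockings) -> Blockings:
    return {name: 0.5 * value + 0.1 for name, value in blockings.items()}


result = iterate_to_convergence(toy_update, {"end_link": 0.0, "switch_to_switch": 0.0})
```

Initializing every blocking to zero mirrors step iii; any better starting estimate only reduces the number of passes.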


4.3 Decomposition 


As we mentioned in Section III, a typical private network is too 
large to model each service point as a switch and use the Katz 
algorithm. The network must be decomposed into segments and ana- 
lyzed separately. To define such a decomposition we must first define 
a traffic parcel. 

The aggregate of point-to-point loads with identical routing, when 
offered to a particular portion of the network, will be called a traffic 
parcel for that network segment. For example, if all point-to-point 
traffic originating on service points homed on switch I and destined 
for service points homed on switch J have identical route lists when 
offered to the inter-machine trunk-group portion of the network, then 
such traffic forms a switch I to switch J parcel in the inter-machine 
trunk-group subnetwork. Any technique using such a traffic aggrega-
tion assumes that the network characteristics of the traffic parcel
sufficiently approximate those of the individual point-to-point loads. 

One method of decomposition is to separate the network into an 
inter-machine trunk-group portion and an end-link portion. Switch-to- 
switch parcels, as defined above, are associated with the inter-machine 
trunk-group network segment. All point-to-point traffic in the end-link 
segment, originating from a point to a switch and offered to the same 
route list of end-links, forms an originating end-link parcel. Similarly, 
all point-to-point traffic offered in the end-link network from a switch 
to a service point, using the same route list, is aggregated into a 
terminating end-link parcel. 

Since each parcel’s point-to-point load components have identical 
routing in its associated network segment, blocking can be computed 
for each parcel using the Katz algorithm. For example, switch-to- 
switch blocking can be computed for switch-to-switch parcels in the 
inter-machine trunk-group network segment, and originating and ter- 
minating parcel blockings can be computed in the end-link segment. 

The above decomposition facilitates aggregation of parcel charac-
teristics to obtain end-to-end characteristics. For example, if points i
and j are homed on switches I and J, respectively, then i-to-j blocking
($b_{ij}$) is


$$b_{ij} = 1 - (1 - B^{I}_{i,\mathrm{org}})(1 - B^{IJ})(1 - B^{J}_{j,\mathrm{term}}). \tag{1}$$


(See Appendix A for a summary of the notations used.) 

The three factors in the product represent the probability of call
completion for the originating parcel i, the switch I to switch J parcel,
and the terminating parcel j, respectively. An implicit assumption in
such a decomposition is that all i-to-j traffic must use end-links homed
on switch J for call completion. Or equivalently, the last switch, called
the terminating switch, that completed i-to-j traffic encounters must
be switch J. Figure 1, however, displays a typical private network in
which terminating switches are not unique.

It illustrates a two-level hierarchical trunk-structure, with on-net
service points i and j homed on lower level switches I and J, respectively. A
bypass access line group is used to route traffic from switch K to point
j. The two-level trunk hierarchy permits i-to-j traffic to reach point j
using either switch J or K as a terminating switch.

Since i-to-j traffic is not required to use the home access line group
serving j, the above equation for $b_{ij}$ is inaccurate. The last two terms
in the product of the equation do not provide the switch I to point j
blocking. To handle routing patterns with nonunique terminating
switches, it is necessary to introduce a switch-to-final-destination
parcel. Such a parcel represents all point-to-point loads from service
points homed on a switch that have identical routing from the switch to
the destination service point. In Fig. 1, a switch I to point j parcel is
required to compute i-to-j blocking. If $B^{Ij}$ is the blocking probability
for such a parcel on the network, then $b_{ij}$ can be accurately estimated
by


$$b_{ij} = 1 - (1 - B^{I}_{i,\mathrm{org}})(1 - B^{Ij}). \tag{2}$$
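The two aggregation formulas, eqs. (1) and (2), differ only in whether a separate terminating end-link factor is present. A minimal Python sketch follows; the function and argument names are illustrative and not taken from the paper.

```python
from typing import Optional


def end_to_end_blocking(
    b_orig: float, b_trunk: float, b_term: Optional[float] = None
) -> float:
    """Point-to-point blocking as one minus the product of completion probabilities.

    b_orig  : originating end-link parcel blocking
    b_trunk : switch-to-switch blocking when b_term is given (eq. 1), or
              switch-to-final-destination blocking when b_term is None (eq. 2)
    b_term  : terminating end-link parcel blocking, if a unique terminating switch exists
    """
    completion = (1.0 - b_orig) * (1.0 - b_trunk)
    if b_term is not None:
        completion *= 1.0 - b_term
    return 1.0 - completion


# Illustrative values only.
b_three_parcels = end_to_end_blocking(0.01, 0.02, 0.03)  # form of eq. (1)
b_two_parcels = end_to_end_blocking(0.01, 0.05)          # form of eq. (2)
```

The product form relies on the parcel blockings being treated as independent, which is the assumption implicit in the decomposition.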


Thus, the NETEVAL decomposition must classify traffic into four 
parcels for adequate performance estimation: originating end-link par- 
cels, terminating end-link parcels, switch-to-switch parcels, and switch- 
to-final-destination parcels. Switch-to-final-destination parcels are 
used only for service points that do not have unique terminating 
switches. To compute characteristics for switch-to-switch and switch- 
to-final-destination parcels, a segment of the network must be defined 
that contains all links used by these parcels. This subnetwork consists 
of all inter-machine trunk groups and those end-links contained in the 
route lists of the switch-to-final-destination parcels. Such a collection 
of links is called an extended trunk network in this paper. With the 
extended trunk network and its associated switch-to-switch and 
switch-to-final-destination parcel means and variances, the appropri- 
ate parcel characteristics can be computed. Similarly, the end-link 
subnetwork and end-link parcels permit the calculation of end-link 
parcel characteristics. Section 4.6 describes how parcel characteristics 
can be aggregated to form the desired end-to-end characteristics. 


4.4 Link analysis 


Both the end-link and switch-to-switch and switch-to-final-destina- 
tion analyses are based on offering effective-offered loads to a single 
trunk group and computing associated link characteristics that are a 
function of overflow parcel means and variances. Whenever alternate 
routing is involved in a network segment, the Katz algorithm is used 
on the appropriate section. (Even in a queuing environment this 
procedure is valid except that a new algorithm to compute link param- 
eters is used.) FRL analysis is included in NETEVAL by stratifying parcel 
loads by FRL and associating different route lists with each FRL 
grouping. 

Effective-offered loads are computed for each link, from which link 
parameters such as blocking and overflow variance are estimated. The 
effective-offered loads associated with a link must take into account 
interactions with the rest of the network. Traffic blocking both prior 
and subsequent to a parcel being offered to a link reduces the effective- 
offered traffic load to the link. However, network setup times, ring 
times, queue delays, and retrial attempts increase effective-offered link 
loads. Even though the end-link and extended trunk network segments
are analyzed separately in NETEVAL, the effective-offered load equa-
tions are designed to reflect the above network characteristics and
thus account for the interactions between the network segments. These
interactions are updated successively in the form of updated parcels in
steps iv and vi as summarized in Section 4.2.

Fig. 2. Link configuration.

The following subsections describe the computation of link param- 
eters both with and without queues. 


4.4.1 Unqueued case 


Figure 2 displays a general link configuration with link-offered
parcels $(a_i, v_i)$, $i = 1$ to $m$. $a_i$ is the mean of the ith parcel and $v_i$ is its
variance. [If queues are present, the queue parcels are $(a_{q_i}, v_{q_i})$.] The
equivalent random method can be used to compute overflow means
and variances for aggregate traffic $(a, v)$. To apportion the aggregate
overflow moments to different parcels we use the Katz parcel-splitting
approach. If $b_i$ is the ith parcel blocking, then


$$b_i = b_o[1 + k(z_i - z_o)], \tag{3}$$

where
$b_o$ = aggregate link blocking,
$z_o$ = peakedness of aggregate traffic,
$z_i = v_i/a_i$, and
$k$ = modification factor.


Several heuristic formulas can be used for k. Katz? expresses k as a 
function of $z_o$ and $b_o$. We have selected the heuristic formula from the
Defense Communications Engineering Center 


k — ceo, (4) 


where c = 2.249257" t = 0.0528257)8, s = 5.456207", and n is the 
number of trunks. 

Once $b_i$ is known, the parcel overflow and carried mean estimates
are $a_i b_i$ and $a_i(1 - b_i)$, respectively. The overflow and carried variance
are approximated by $z_o^{*} a_i b_i$ and $v_i - z_o^{*} a_i b_i$, where $z_o^{*}$ is the overall
overflow peakedness. The overflow variance estimate results in equal
peakedness to all parcels.
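The parcel-splitting step can be sketched as below, in Python. The aggregate blocking, the overall overflow peakedness, and the modification factor k are supplied by the caller, since they come from the equivalent random method and from the heuristic for k. The function name, the dictionary keys, and the assumption that aggregate mean and variance are sums over independent parcels are illustrative choices, not statements from the paper.

```python
from typing import Dict, List, Tuple


def split_parcels(
    parcels: List[Tuple[float, float]],
    b_o: float,
    z_overflow: float,
    k: float,
) -> List[Dict[str, float]]:
    """Apportion aggregate link blocking to parcels using the form of eq. (3).

    parcels    : list of (mean a_i, variance v_i) offered to the link
    b_o        : aggregate link blocking
    z_overflow : overall overflow peakedness
    k          : modification factor (taken as given here)
    """
    a = sum(a_i for a_i, _ in parcels)
    v = sum(v_i for _, v_i in parcels)  # assumes independent parcels
    z_o = v / a                         # peakedness of aggregate traffic
    results = []
    for a_i, v_i in parcels:
        z_i = v_i / a_i
        b_i = b_o * (1.0 + k * (z_i - z_o))
        overflow_mean = a_i * b_i
        overflow_variance = z_overflow * overflow_mean
        results.append(
            {
                "blocking": b_i,
                "overflow_mean": overflow_mean,
                "carried_mean": a_i * (1.0 - b_i),
                "overflow_variance": overflow_variance,
                "carried_variance": v_i - overflow_variance,
            }
        )
    return results


# Illustrative values: one Poisson-like parcel and one peaked parcel.
split = split_parcels([(4.0, 4.0), (6.0, 12.0)], b_o=0.1, z_overflow=2.0, k=0.5)
```

With the aggregate peakedness defined as total variance over total mean, the parcel overflow means sum to the aggregate overflow mean for any k, so the split redistributes blocking without changing the link total.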


4.4.2 Queued case 


To derive the overflow mean and variance for a parcel offered to a 
link in the presence of queues, another approach is needed. We 
approximate the behavior of a link containing queues with a two- 
dimensional Markov chain for each peaked parcel and a one-dimen- 
sional Markov chain for each smooth parcel. The parcel is modeled 
based on its mean and variance. If the parcel is peaked (peakedness 
greater than one), an Interrupted Poisson Process (IPP) is used. Smooth 
parcels (peakedness less than one), which can occur when traffic 
carried on previous groups is offered to a trunk group, are modeled as 
a Poisson process with parameters adjusted to better reflect the small 
peakedness. 

4.4.2.1 Interrupted Poisson processes. An IPP, also called switched
Poisson, is a Poisson process with rate $\lambda$, which alternately for some
period of time shuts off all arrivals from the Poisson process, and lets
the arrivals go through for another period of time. Both time periods
are exponentially distributed, independent of each other and any
previous time periods. IPP is used to model traffic with peakedness of
one or greater. We say that the switch is ON when the arrivals go
through, and otherwise the switch is OFF. The IPP process is shown in
Fig. 3. A Poisson process is an IPP with the switch constantly in the
ON position. Given $(a_i, v_i)$, an IPP with an expected on time $\gamma^{-1}$, an
expected off time $\omega^{-1}$, and a Poisson rate $\lambda$ (when the IPP is in the ON
state) can be obtained from the relationships


¢) 
where $\lambda$ is arbitrarily set to the larger of $a_i$ and $v_i$. $v_i - a_i + a_i^2$ is the
second factorial moment of the IPP.

This completely specifies the IPP point process. For further proper-
ties of the IPP, see Kuczura.*
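For illustration, a two-moment IPP fit can be sketched in Python as follows. The sketch uses the standard moments of an IPP offered to an infinite trunk group with unit mean holding time, namely mean $\lambda\omega/(\gamma+\omega)$ and peakedness $1 + \lambda\gamma/[(\gamma+\omega)(1+\gamma+\omega)]$. It is an assumption that these correspond to the relationships referred to above; the function names are hypothetical, and the rate `lam` is left to the caller.

```python
from typing import Tuple


def ipp_moments(lam: float, gamma: float, omega: float) -> Tuple[float, float]:
    """Mean and variance of the busy-server count in an infinite trunk group
    fed by an IPP (unit mean holding time, mean on time 1/gamma, mean off time 1/omega)."""
    total = gamma + omega
    mean = lam * omega / total
    peakedness = 1.0 + lam * gamma / (total * (1.0 + total))
    return mean, mean * peakedness


def fit_ipp(a: float, v: float, lam: float) -> Tuple[float, float]:
    """Return (gamma, omega) matching mean a and variance v for a chosen rate lam."""
    z = v / a
    if z <= 1.0:
        raise ValueError("an IPP fit needs a peaked parcel (v > a)")
    if lam <= a:
        raise ValueError("lam must exceed the mean a")
    p_on = a / lam                                  # long-run fraction of time ON
    total = lam * (1.0 - p_on) / (z - 1.0) - 1.0    # gamma + omega
    if total <= 0.0:
        raise ValueError("no IPP with this lam matches the given moments")
    return (1.0 - p_on) * total, p_on * total


# Illustrative round trip: recover the parameters from their own moments.
mean, variance = ipp_moments(6.0, 0.5, 1.0)
gamma_fit, omega_fit = fit_ipp(mean, variance, 6.0)
```

Not every choice of `lam` admits a solution, which is why the sketch validates its inputs rather than assuming a particular rule for selecting the rate.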

4.4.2.2 Peaked parcel analysis. The IPP ON and OFF states form one
dimension of the Markov chain used in the queued case. The other
dimension consists of three superstates:

(i) Not-all-trunks-busy
(ii) All-trunks-busy, but no-calls-in-queue
(iii) Some-calls-waiting-in-queue.

Note that in case (iii), necessarily all trunks are busy, and that the
system moves from state 1 only to state 2, from state 2 either to state
1 or to state 3, and from state 3 only to state 2. No transition between
states 1 and 3 is possible.

Thus, the Markov chain space is $(X, Y)$, $X = 0, 1$, $Y = 1, 2, 3$, where
$X$ is the IPP status of a parcel offered to a link and $Y$ is one of the three
superstates discussed above. Figure 4 displays the flows among the
states.

Modeling the complex behavior of the trunk group with queues by 
compressing all states into superstates and assuming constant transi- 
tion rates between them is a crucial assumption. The simplified state 
space is not Markovian any more, and the rates between the states are 
generally nonconstant. However, for first-moment calculations, such
as the mean of the overflow stream, the replacement of the noncon-
stant transition rate by the average rate gives exact results. The
calculation of the rates is given below. Some of the rates are themselves 
approximations for the average rate. 

Since traffic is offered to the queues only when all trunks are busy
in the group, we will consider the queue-offered traffic “conditional on
all trunks busy,” and use the conditional mean and variance of these
streams when computing link characteristics. If $(a_q, v_q)$ is the uncon-
ditional queue mean and variance, and $p$ is the probability that all
trunks are busy, then the mean and variance conditional on all-trunks-
busy are given by


$$a_{cq} = a_q / p, \tag{7}$$

$$v_{cq} = \frac{v_q}{p} + \frac{a_q^2}{p}\left(1 - \frac{1}{p}\right). \tag{8}$$


These equations are based on the assumption that if the random
variable $X$ is the number of busy servers in an infinite trunk group
receiving traffic from the queue stream, and $X_c$ is the same quantity
conditional on all trunks busy, then $E[X^i] = p\,E[X_c^i]$, $i = 1, 2$.
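The conditioning step can be checked numerically with a short Python sketch. It applies the stated moment assumption directly: the first and second moments of the conditional stream are the unconditional ones divided by p. The function name is illustrative.

```python
from typing import Tuple


def conditional_queue_moments(a_q: float, v_q: float, p: float) -> Tuple[float, float]:
    """Mean and variance of queue-offered traffic conditional on all trunks busy.

    a_q, v_q : unconditional mean and variance of the queue-offered stream
    p        : probability that all trunks are busy (0 < p <= 1)
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    a_cq = a_q / p                                 # first moment scaled by 1/p
    second_moment_cq = (v_q + a_q * a_q) / p       # second moment scaled by 1/p
    v_cq = second_moment_cq - a_cq * a_cq
    return a_cq, v_cq


# Illustrative values only.
a_cq, v_cq = conditional_queue_moments(a_q=1.0, v_q=2.0, p=0.5)
```

Expanding the last line of the function gives $v_q/p + (a_q^2/p)(1 - 1/p)$, so the variance can fall below $v_q/p$ when p is small.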

4.4.2.3 Transition rates. To compute transition rates $\omega_1^{ON}$ and $\omega_1^{OFF}$,
we need to define $\omega_1$ as the average frequency of transition from not-
all-trunks-busy states to all-trunks-busy states. Since this rate does
not depend on the presence of queues, we may calculate $\omega_1$ on a trunk
group without queues. Let $b_T$ be the time congestion of a trunk group
of size $n$, offered traffic with mean $a$ and variance $v$. Further, let $T_1$
and $T_2$ be the expected sojourn times of the trunk-group system
(without queues) in states not-all-trunks-busy and all-trunks-busy,
respectively. In that case


$$b_T = \frac{T_2}{T_1 + T_2}. \tag{9}$$


Since $T_2 = n^{-1}$ (assuming unit average holding time), for known $b_T$
one may find $T_1$, and hence $\omega_1 = 1/T_1$, immediately. The time conges-
tion $b_T$ is equal to the blocking, $b_i$, that would be experienced by a
Poisson parcel offered to the trunk group. This is given by the Katz
parcel splitting formula


$$b_i = b_o[1 + k(z_i - z_o)], \tag{10}$$


where $z_i$ is equal to 1.
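As a worked illustration of this step in Python: with the all-trunks-busy sojourn time equal to 1/n and the time congestion equal to the fraction of time spent all-trunks-busy, the not-busy sojourn time and hence the transition rate follow by simple algebra. The function name is hypothetical.

```python
def not_busy_to_busy_rate(b_t: float, n: int) -> float:
    """Average rate of transition from not-all-trunks-busy to all-trunks-busy.

    b_t : time congestion of the trunk group (0 < b_t < 1)
    n   : number of trunks; time is measured in mean holding times
    """
    if not 0.0 < b_t < 1.0:
        raise ValueError("time congestion must lie strictly between 0 and 1")
    t2 = 1.0 / n                     # expected all-trunks-busy sojourn time
    t1 = t2 * (1.0 - b_t) / b_t      # from b_t = t2 / (t1 + t2)
    return 1.0 / t1


# Illustrative values: 10 trunks with 20 percent time congestion.
omega_1 = not_busy_to_busy_rate(0.2, 10)
```

Equivalently, the rate is $n\,b_T/(1 - b_T)$, which grows without bound as the time congestion approaches one.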

Now that $\omega_1$ is known we can solve for $\omega_1^{ON}$ and $\omega_1^{OFF}$. We require
that in the long run the number of transitions from states [(1, 1),
(0, 1)] to [(1, 2), (0, 2)] in Fig. 4 is $\omega_1$ per unit of time. Since movement
among these states does not involve transition rates in or out of states
[(1, 3), (0, 3)], it is more convenient to consider a smaller Markov chain
$(X^*, Y^*)$, $X^* = 0, 1$, $Y^* = 1, 2$, which is a subset of the chain shown in
Fig. 4. We can then express the relationship between $\omega_1$, $\omega_1^{OFF}$, and $\omega_1^{ON}$
as


$$\omega_1 = \omega_1^{OFF}\,\frac{p_{01}}{p_{01} + p_{11}} + \omega_1^{ON}\,\frac{p_{11}}{p_{01} + p_{11}}, \tag{11}$$

where $p_{ij}$ is the steady-state probability distribution for the Markov
chain $(X^*, Y^*)$.

We still require an additional constraint to make $\omega_1^{ON}$ and $\omega_1^{OFF}$
unique. A reasonable additional assumption is that in the absence of
queuing, the parcel blocking is the same as that obtained from the
parcel splitting formula. Thus, we require


$$b_i = \frac{p_{12}}{p_{11} + p_{12}}, \tag{12}$$


where $b_i$ is obtained from the Katz parcel splitting formula. The $p_{ij}$ are
themselves expressed in terms of $\omega_1^{OFF}$ and $\omega_1^{ON}$. However, it can be
shown that the equations above lead to an equation no worse than
quadratic.

To compute the transition rate $\gamma_1$, assume the system is in an all-
trunks-busy state, with no-calls-in-queue. $\gamma_1$ is the rate with which
trunks become available. Since we express time in multiples of an
average holding time, $\gamma_1 = n$ (the number of trunks). Further, under
the assumption that the holding times are all independent, exponen-
tially distributed variables, the transition rate $\gamma_1$ is a constant, inde-
pendent of the time spent in the all-trunks-busy state, with no-calls-
in-queue.

To compute $\omega_2$ assume the system is in the state with all-trunks-
busy and no-calls-in-queue. Then the frequency with which the state
some-calls-waiting-in-queue is entered is the input rate into the queues.
Since the queue-input rates $a_{cq_1}$ and $a_{cq_2}$ are given conditionally, it is
clear that


$$\omega_2 = a_{cq_1} + a_{cq_2}, \tag{13}$$


where $\omega_2$ is a constant, rather than the average rate, when the queue
input streams are both Poisson. If the input streams are peaked, they
may be represented as interrupted Poisson processes, and a more
precise, but more elaborate calculation for $\omega_2$ may be carried out,
taking account of the variances of the queue-input streams. This
precision was not considered necessary.

Note that in all of the above, it is possible that one or both queues
are non-existent. The mean $a_{cq}$ should be set equal to 0 for non-existent
queues.

$\gamma_2$, the rate from the state calls-in-queue to the state with all-trunks-
busy and no-calls-in-queue, is the inverse of the mean length of time
the system spends in the state with some-calls-in-queue. What we
therefore need to calculate is the average duration from the time the
first call enters one of the queues until the first subsequent time both
queues are empty. We denote this mean duration by $Q$ and call it “the
queues busy period.” One then has


$$\gamma_2 = Q^{-1}. \tag{14}$$


The calculation of Q is appreciably simpler for links with one queue 
than for links with two queues. In either case, one models the input 
stream(s) into the queue(s) as IPP(s), and then finds the simultaneous 
ergodic distribution of the number of waiting calls and the IPP input 
state. This ergodic distribution is a conditional distribution, given that 
all trunks are busy. The input streams are taken to be conditional on 
this event. 

Lastly, $\omega_o$ and $\gamma_o$ are the on and off rates for an IPP parcel offered to
a trunk group. Thus, eqs. (5) and (6) can be used to compute $\omega_o$ and
$\gamma_o$.

4.4.2.4 Computation of link characteristics. The overflow mean, 
overflow variance, and carried variance of a parcel (a;, v;) are computed 
by adding another dimension to the state space, which is the number 
of calls in an infinite trunk group that receives parcel overflow traffic 
when the system is in an all-trunks-busy state. Moment-generating 
functions can be used to compute the desired quantities. (Details are 
provided in Appendix B.) The overall link blocking is then the ratio of 
the sum of the parcel overflow means to the sum of the parcel offered 
loads. 

Each of the parcels offered to the queues gives rise to two resultant 
parcels, namely the overflow parcel and the carried parcel. The queue 
overflow parcel consists of those calls finding all queue slots occupied, 
and the carried parcel contains all other calls, since we assume that
there are no abandonments or time-outs. 

As a result of assumed independence of the parcels offered to the 
group, the parcels offered to the two queues are independent during 
the time intervals that all trunks are busy. This means that waiting 
times and other variables associated with a two-queue trunk group 
may be estimated through simple Markov chain calculations. 

Queue blocking and delay are computed conditional on all-trunks-
busy. A Markov chain $(j_1, j_2, i_1, i_2)$, where the $j$'s are the number in
each queue and the $i$'s represent the on/off conditions of IPP queue
streams, is used to compute these quantities. (Appendix C explains the
algorithms used to compute the steady-state probabilities of the chain.)
For example, if $e_{j_1, j_2, i_1, i_2}$ is the ergodic distribution of the chain and
$(\gamma_1, \omega_1)$ represents the on and off rates of the $Q_1$ arrival stream, then
the $Q_1$ blocking probability is


$$b_{q_1} = \left(1 + \frac{\gamma_1}{\omega_1}\right) \sum_{j} \left(e_{q_1, j, 1, 0} + e_{q_1, j, 1, 1}\right). \tag{15}$$


Since a customer can arrive at $Q_1$ only when the IPP is ON, eq. (15)
is conditioned on the queue arrival stream being on.

Fig. 5. Markov chain for smooth parcel input.
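A minimal Python sketch of the blocking calculation in eq. (15) follows. It assumes the ergodic distribution is available as a dictionary keyed by $(j_1, j_2, i_1, i_2)$, that `q1` denotes the number of slots in queue 1 (so $j_1 = q_1$ means the queue is full), and that the first queue's stream is ON a fraction $\omega_1/(\omega_1+\gamma_1)$ of the time. These representation choices are assumptions for illustration.

```python
from typing import Dict, Tuple

State = Tuple[int, int, int, int]  # (j1, j2, i1, i2)


def queue1_blocking(
    e: Dict[State, float], q1: int, gamma1: float, omega1: float
) -> float:
    """Probability that queue 1 is full, given that its IPP arrival stream is ON."""
    full_and_on = sum(
        prob for (j1, _j2, i1, _i2), prob in e.items() if j1 == q1 and i1 == 1
    )
    # Dividing by P(ON) = omega1 / (omega1 + gamma1) equals multiplying by 1 + gamma1/omega1.
    return (1.0 + gamma1 / omega1) * full_and_on


# Hypothetical distribution built as a product of two independent (queue, IPP) marginals.
marginal_1 = {(0, 0): 0.3, (0, 1): 0.4, (1, 0): 0.2, (1, 1): 0.1}
marginal_2 = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
dist = {
    (j1, j2, i1, i2): p1 * p2
    for (j1, i1), p1 in marginal_1.items()
    for (j2, i2), p2 in marginal_2.items()
}
b_q1 = queue1_blocking(dist, q1=1, gamma1=1.0, omega1=1.0)
```

In this toy case the first stream is ON half of the time and the queue is full and ON with probability 0.1, so the conditional blocking is 0.2.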

4.4.2.5 Smooth traffic parcel analysis. A smooth traffic parcel is
offered to a link when $v_i$ is less than $a_i$. A somewhat simplistic approach
is used to compute the overflow stream characteristic for a smooth
parcel. A one-dimensional Markov chain is formed, assuming the
parcel is Poisson (see Fig. 5). All rates, except for $\omega_1^{SM}$ (used in place of
$\omega_1^{ON}$ and $\omega_1^{OFF}$), are the same as in the case of the peaked traffic analysis.
$\omega_1^{SM}$ is computed in the same manner as $\omega_1$, but with $z_i$ set to the parcel
peakedness.

Overflow quantities can be calculated using the same techniques
applied to peaked parcels. The corresponding variance calculation,
however, leads to too large a variance of the overflow, because the
input into the trunk-group characterization switch is taken to be a
Poisson stream with mean and variance $(a_i, a_i)$, instead of the given
smooth stream with mean and variance $(a_i, v_i)$. To correct for this
effect, we reduce the calculated variance by multiplying it by the
peakedness (less than one) of the offered parcel. The estimated
overflow variance is $z_i v_i^{*}$, where $v_i^{*}$ is the overflow variance based on
Poisson input. The mean overflow is given by


$$a_i\,\frac{\omega_1^{SM}(\omega_2 + \gamma_2)}{(\omega_1^{SM} + \gamma_1)(\omega_2 + \gamma_2) - \gamma_1 \omega_2}. \tag{16}$$
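The smooth-parcel calculation can be sketched in Python as below. The fraction is the stationary probability that the three-state chain (not-all-trunks-busy, all-trunks-busy with empty queue, calls waiting) is in either all-trunks-busy state. Scaling that probability by the parcel mean to obtain the mean overflow is an assumption made here for Poisson input, and the function names are illustrative.

```python
def smooth_overflow_mean(
    a_i: float, omega1_sm: float, gamma1: float, omega2: float, gamma2: float
) -> float:
    """Mean overflow of a smooth parcel treated as Poisson with mean a_i."""
    numerator = omega1_sm * (omega2 + gamma2)
    denominator = (omega1_sm + gamma1) * (omega2 + gamma2) - gamma1 * omega2
    return a_i * numerator / denominator


def smooth_overflow_variance(poisson_overflow_variance: float, a_i: float, v_i: float) -> float:
    """Scale the Poisson-based overflow variance by the parcel peakedness v_i / a_i."""
    return (v_i / a_i) * poisson_overflow_variance


# Illustrative rates: stationary weights are 1, 0.5 and 0.5, so half the time is all-trunks-busy.
overflow_mean = smooth_overflow_mean(4.0, omega1_sm=1.0, gamma1=2.0, omega2=1.0, gamma2=1.0)
overflow_var = smooth_overflow_variance(3.0, a_i=4.0, v_i=2.0)
```

The denominator expands to $\omega_1^{SM}\omega_2 + \omega_1^{SM}\gamma_2 + \gamma_1\gamma_2$, which is the normalizing sum of the birth-death chain's stationary weights multiplied by $\gamma_1\gamma_2$.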
4.5 Convergence criteria


The convergence criterion for step viii of the NETEVAL algorithm
(Section 4.2) assumes convergence has occurred if the absolute differ-
ence of each end-link, switch-to-switch, and switch-to-final-destination
blocking probability in successive iterations is within a specified limit.


4.6 Point-to-point characteristics 


Once the network has been decomposed into the extended trunk 
network and the end-link network, and the components have been 
analyzed, aggregation techniques are required to compute point-to- 
point characteristics. This is done by associating two or three traffic 
parcels, depending on the routing patterns, with each point-to-point 
pair. If i-to-j traffic is part of a switch-to-final-destination parcel when
offered to the extended trunk network, then only two parcels are
associated with the point pair: the originating parcel at point i offered
to the end-link network segment, and the switch-to-final-destination
parcel in which it is contained. If the i-to-j traffic is part of a switch-
to-switch parcel in the extended network, then three traffic parcels are
associated with the point pair: the originating parcel, the switch-to-
switch parcel, and the terminating parcel in which i-to-j traffic is
contained. Point-to-point characteristics are functions of the corre-
sponding parcel characteristics.

If i-to-j traffic is part of a switch-to-switch parcel when traversing
the extended trunk network, the i-to-j blocking $b_{ij}$ is


$$b_{ij} = 1 - (1 - B^{I}_{i,\mathrm{org}})(1 - B^{IJ})(1 - B^{J}_{j,\mathrm{term}}). \tag{17}$$


If i-to-j traffic, however, is contained in a switch-to-final-destination
parcel,


$$b_{ij} = 1 - (1 - B^{I}_{i,\mathrm{org}})(1 - B^{Ij}). \tag{18}$$


The i-to-j expected queue delay, $W_{ij}$, is a weighted average of the
expected queue delay on trunks and end-links. When i-to-j traffic is
contained in a switch-to-switch parcel,


$$W_{ij} = (1 - B^{I}_{i,\mathrm{org}})\,W^{IJ} + (1 - B^{I}_{i,\mathrm{org}})(1 - B^{IJ})\,W^{J}_{j}. \tag{19}$$


The trunk delay, $W^{IJ}$, and terminating link delay, $W^{J}_{j}$, are weighted
by the probability that i-to-j traffic will be offered to a link on which
it is queue-eligible. When i-to-j traffic is contained in a switch-to-final-
destination parcel,


$$W_{ij} = (1 - B^{I}_{i,\mathrm{org}})\,W^{Ij}, \tag{20}$$


where $W^{Ij}$ is the expected queue delay for traffic originating at switch
I destined for point j. $1 - B^{I}_{i,\mathrm{org}}$ is the probability that the traffic seizes
an end-link and arrives at switch I.

Fig. 6. End-to-end blocking distribution, Network A. Axes: total traffic in percent versus blocking in percent. Call-detail-generated distribution: $\mu = 17$, $\sigma^2 = 271$. NETEVAL-generated distribution: $\mu = 16$, $\sigma^2 = 258$.


The i-to-j queue delay probabilities, $PD_{ij}$, are computed analogously
to $W_{ij}$. When i-to-j traffic is contained in a switch-to-switch parcel
when traversing the extended trunk network,


$$PD_{ij} = (1 - B^{I}_{i,\mathrm{org}})\,PD^{IJ} + (1 - B^{I}_{i,\mathrm{org}})(1 - B^{IJ} - PD^{IJ})\,PD^{J}_{j}, \tag{21}$$


where $(1 - B^{I}_{i,\mathrm{org}})(1 - B^{IJ} - PD^{IJ})\,PD^{J}_{j}$ is the probability of traversing
the trunk portion of the network without queue delay but incurring
queue delay on the terminating end-link.

If i-to-j traffic is contained in a switch-to-final-destination parcel,


$$PD_{ij} = (1 - B^{I}_{i,\mathrm{org}})\,PD^{Ij}. \tag{22}$$


Conditional expected queue delays are given by $W_{ij}/PD_{ij}$, for $PD_{ij} >
0$. Note that by creating an artificial access-line group with no blocking
or delay, the above equations remain valid for ETS networks with
traffic originating at a tandem switch.
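The delay aggregation for the switch-to-switch case, eqs. (19) and (21), together with the conditional delay, can be sketched in Python as follows. The function and argument names are illustrative, and the numerical inputs are arbitrary.

```python
from typing import Optional, Tuple


def switch_to_switch_delay(
    b_orig: float,
    b_trunk: float,
    w_trunk: float,
    w_term: float,
    pd_trunk: float,
    pd_term: float,
) -> Tuple[float, float]:
    """Return (expected queue delay, delay probability) for a point pair whose
    traffic rides a switch-to-switch parcel."""
    reach_switch = 1.0 - b_orig  # probability of seizing an originating end-link
    w = reach_switch * w_trunk + reach_switch * (1.0 - b_trunk) * w_term
    pd = reach_switch * pd_trunk + reach_switch * (1.0 - b_trunk - pd_trunk) * pd_term
    return w, pd


def conditional_delay(w: float, pd: float) -> Optional[float]:
    """Expected queue delay given that the call is delayed; undefined when pd is zero."""
    return w / pd if pd > 0.0 else None


# Illustrative values only.
w_ij, pd_ij = switch_to_switch_delay(
    b_orig=0.1, b_trunk=0.2, w_trunk=10.0, w_term=5.0, pd_trunk=0.3, pd_term=0.4
)
w_given_delay = conditional_delay(w_ij, pd_ij)
```

The switch-to-final-destination case reduces to a single term in each formula, obtained by dropping the terminating end-link contribution.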


V. APPLICATIONS 


EPSCS and ETS networks generate call-detail records. These records
contain information on individual calls in the network from which it is
possible to estimate the end-to-end blocking distribution in existing
networks. This information can be compared with an ENADS NETEVAL
analysis of the same network. Figures 6 and 7 display both call detail
and NETEVAL-derived blocking distributions for two existing networks.

Fig. 7. End-to-end blocking distribution, Network B. Axis label: blocking in percent. Call-detail-generated distribution: $\mu = 4$, $\sigma^2 = 69$. NETEVAL-generated distribution: $\mu = 6$, $\sigma^2 = 45$.

The means and variances in Fig. 6 are very close. The variances in 
Fig. 7 are not as similar as their corresponding means. This is primarily 
because the midpoint of the last cell in the call-detail-derived distri- 
bution is 43 percent, compared with 30 percent in the NETEVAL 
distribution. Note, however, that the overall shape of the distributions 
within Figs. 6 and 7 is similar. This suggests that the modeling 
techniques in our evaluation algorithm provide a reasonable prediction 
of end-to-end performance. 


VI. SUMMARY 


Algorithms for computing end-to-end and link characteristics in
private networks have been described in this paper. The algorithms
generalize the Katz procedure to provide end-to-end characteristics for 
large-scale networks and also model private network features such as 
queuing and FRLs. These techniques have been incorporated into the 
ENADS Service Evaluator. NETEVAL performance predictions have 
compared well with actual measurements for several networks. The 
ENADS NETEVAL module is now routinely used with the companion 
NETSYN module by AT&T Long Lines and Operating Company private 
network administrators for network design and evaluation.

APPENDIX A


Notations 


Much of the notation used in Section IV is summarized below for
the reader’s convenience.

Parcel Characteristics

Blocking
$b_{ij}$ = Point i to point j blocking.
$B^{I}_{i,\mathrm{org}}$ = Originating end-link blocking for service point i homed on switch I.
$B^{J}_{j,\mathrm{term}}$ = Terminating end-link blocking for service point j homed on switch J.
$B^{IJ}$ = Switch I to switch J blocking.
$B^{Ij}$ = Switch I to final destination point j blocking.

Delays
$W_{ij}$ = Point i to point j queue delay.
$W^{J}_{j}$ = Terminating end-link queue delay for service point j homed on switch J.
$W^{IJ}$ = Switch I to switch J queue delay.
$W^{Ij}$ = Switch I to final destination point j queue delay.
$PD_{ij}$ = Point i to point j delay probability.
$PD^{J}_{j}$ = Terminating end-link delay probability for service point j homed on switch J.
$PD^{IJ}$ = Switch I to switch J delay probability.
$PD^{Ij}$ = Switch I to final destination point j delay probability.

Link Parameters
$(a_i, v_i)$ = Mean and variance of ith parcel offered to trunk group.
$z_i$ = Peakedness of ith parcel offered to trunk group.
$(a_{q_i}, v_{q_i})$ = Mean and variance of load offered to queue.
$(a_{cq_i}, v_{cq_i})$ = Mean and variance of load offered to queue conditioned on all trunks busy.
$b_i$ = Blocking of ith parcel offered to trunk group.
$(a, v)$ = Aggregate trunk-group load mean and variance.
$z_o$ = Trunk-group peakedness of offered load.
$z_o^{*}$ = Trunk-group overflow peakedness.
$b_o$ = Trunk-group blocking.

Markov Transition Rates in Trunk-Group Queuing Model
$\gamma_o$ = Interrupted Poisson Process off rate.
$\omega_o$ = Interrupted Poisson Process on rate.
$\omega_1^{ON}$ = Not-all-trunks-busy to all-trunks-busy, no-calls-in-queue flow rate, given Interrupted Poisson Process (IPP) on.
$\omega_2$ = All-trunks-busy, no-calls-in-queue to calls-waiting-in-queue flow rate.
$\omega_1^{OFF}$ = Not-all-trunks-busy to all-trunks-busy, no-calls-in-queue flow rate, given IPP off.
$\gamma_2$ = Calls-waiting-in-queue to all-trunks-busy, no-calls-in-queue flow rate.
$\gamma_1$ = All-trunks-busy, no-calls-in-queue to not-all-trunks-busy flow rate.


APPENDIX B 


Overflow Mean and Variance of Peaked Parcel 


This appendix gives the details of the calculation of the overflow 
mean and variance of a parcel, modeled as an IPP, offered to a trunk 
group, which in turn is modeled by three states and the four transition 
rates among them. Let the variables $\lambda$, $\gamma_0$, $\gamma_1$, 
$\gamma_2$, $\omega_0$, $\omega_1^{ON}$, $\omega_1^{OFF}$, and $\omega_2$ 
be as in Section 4.4.2. 

The system to be treated, including the infinite trunk group, is a 
continuous-time Markov chain, with state space 

$$N = \{(k, i, j):\ k = 0, 1, \ldots;\ i = 0, 1;\ j = 1, 2, 3\},$$

where $k$ is the number of calls in the infinite trunk group, $i$ is the 
status (ON = 1, OFF = 0) of the IPP, and $j$ is the state of the trunk 
group (not-all-trunks-busy; all-trunks-busy, but no-calls-in-queue; 
some-calls-waiting-in-queue). Let $S_1$ denote the status of the IPP and 
$S_2$ the status of the trunk group. Designating by $p_{kij}$ the 
steady-state probability that the system is in state $(k, i, j)$, we 
define the generating functions 

$$\pi_{ij}(s) = \sum_{k=0}^{\infty} p_{kij}\, s^k, \quad \text{for } 0 \le s \le 1,\ i = 0, 1,\ j = 1, 2, 3.$$

The mean number of calls in the infinite trunk group is 

$$a^* = \sum_{k,i,j} k\, p_{kij} = \sum_{i,j} \frac{d}{ds}\pi_{ij}(1) = \mathbf{1}' \frac{d}{ds}\pi(1),$$

where $\mathbf{1}'$ stands for the six-element row vector of ones and 

$$\pi(s) = \begin{pmatrix} \pi_{01}(s) \\ \pi_{02}(s) \\ \pi_{03}(s) \\ \pi_{11}(s) \\ \pi_{12}(s) \\ \pi_{13}(s) \end{pmatrix}.$$

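Similarly, the second factorial moment is 

$$E\,X(X - 1) = \sum_{k,i,j} k(k - 1)\, p_{kij} = \sum_{i,j} \frac{d^2}{ds^2}\pi_{ij}(1) = \mathbf{1}' \frac{d^2}{ds^2}\pi(1),$$

from which one easily finds the variance. Writing out the steady-state 
equations for $p_{kij}$ and summing appropriately, one finds that the 
$\pi_{ij}(s)$ satisfy a system of equations, which we will define in 
vector form. We define 

$$A = \begin{pmatrix}
-(\omega_0 + \omega_1^{OFF}) & \gamma_1 & 0 & \gamma_0 & 0 & 0 \\
\omega_1^{OFF} & -(\omega_0 + \gamma_1 + \omega_2) & \gamma_2 & 0 & \gamma_0 & 0 \\
0 & \omega_2 & -(\omega_0 + \gamma_2) & 0 & 0 & \gamma_0 \\
\omega_0 & 0 & 0 & -(\gamma_0 + \omega_1^{ON}) & \gamma_1 & 0 \\
0 & \omega_0 & 0 & \omega_1^{ON} & -(\gamma_0 + \gamma_1 + \omega_2) & \gamma_2 \\
0 & 0 & \omega_0 & 0 & \omega_2 & -(\gamma_0 + \gamma_2)
\end{pmatrix}$$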

and the diagonal matrix $B$ by 

$$B = \mathrm{diag}(0, 0, 0, 0, \lambda, \lambda),$$

where the elements of the diagonal are given inside the parentheses. 
The vector $\pi(s)$ satisfies the equation 

$$(s - 1)\frac{d}{ds}\pi(s) = A\pi(s) + (s - 1)B\pi(s). \tag{23}$$

Setting $s = 1$ in (23), one obtains $A\pi(1) = 0$, from which the marginal 
probabilities $\pi_{ij}(1) = P\{S_1 = i, S_2 = j\}$ are found. Taking 
derivatives w.r.t. $s$ in (23) and setting $s = 1$ gives 

$$(I - A)\frac{d}{ds}\pi(1) = B\pi(1).$$

Since $\mathbf{1}'A = 0$, one gets 

$$\mathbf{1}'\frac{d}{ds}\pi(1) = \mathbf{1}'(I - A)\frac{d}{ds}\pi(1) = \mathbf{1}'B\pi(1), \tag{24}$$

and hence the mean $a^*$ of the overflow may be found by solving 
$A\pi(1) = 0$, with the side condition $\mathbf{1}'\pi(1) = 1$. 
Differentiating eq. (23) twice, one gets 


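$$2\frac{d^2}{ds^2}\pi(s) + (s - 1)\frac{d^3}{ds^3}\pi(s) = A\frac{d^2}{ds^2}\pi(s) + 2B\frac{d}{ds}\pi(s) + (s - 1)B\frac{d^2}{ds^2}\pi(s).$$

Setting $s = 1$ and solving for $\frac{d^2}{ds^2}\pi(1)$ gives 

$$\frac{d^2}{ds^2}\pi(1) = 2(2I - A)^{-1}B\frac{d}{ds}\pi(1).$$

Again, since $\mathbf{1}'(2I - A) = 2\,\mathbf{1}'$, one finds 

$$\mathbf{1}'\frac{d^2}{ds^2}\pi(1) = \mathbf{1}'B\frac{d}{ds}\pi(1),$$

or, substituting $\frac{d}{ds}\pi(1) = (I - A)^{-1}B\pi(1)$ from the 
relation preceding (24), 

$$\mathbf{1}'\frac{d^2}{ds^2}\pi(1) = \mathbf{1}'B(I - A)^{-1}B\pi(1).$$

Thus, by inverting the $6 \times 6$ matrix $I - A$, the variance of the 
overflow may be found. 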

The carried traffic mean may be obtained directly from the overflow 
mean via 

$$a_c = a - a^*,$$

where $a_c$ is the carried mean, $a$ the offered traffic mean, and $a^*$ 
the overflow mean. It can be shown that $a_c$ may alternatively be found 
via the procedure sketched above, if we change the $B$ matrix to 

$$B = \mathrm{diag}(0, 0, 0, \lambda, 0, 0).$$


This change represents the fact that carried traffic leaves when the 
IPP switch is ON, and the trunk group characterization is in the not- 
all-trunks-busy state. 

With this change in B, the variance of the carried traffic may be 
calculated in exactly the same way as the overflow variance. 
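
The procedure of this appendix can be illustrated numerically. The 
following Python fragment is only a minimal sketch under stated 
assumptions: the function name `ipp_overflow_moments` and its argument 
names are hypothetical, NumPy is assumed to be available, and all rates 
are assumed to be expressed in units of the mean holding time, as the 
form of eq. (23) suggests. The matrix A is entered in the state order 
(0,1), (0,2), (0,3), (1,1), (1,2), (1,3) used for the vector of 
generating functions. 

```python
import numpy as np

def ipp_overflow_moments(lam, w0, g0, w1_on, w1_off, w2, g1, g2, carried=False):
    """Mean and variance of the overflow (or carried) traffic of Appendix B.

    lam: IPP arrival rate while ON; w0 / g0: IPP on / off rates;
    w1_on, w1_off, w2, g1, g2: trunk-group transition rates.
    All names are illustrative; rates are in units of the mean holding time.
    """
    # Columns are "from" states and rows are "to" states, so 1'A = 0.
    A = np.array([
        [-(w0 + w1_off), g1, 0.0, g0, 0.0, 0.0],
        [w1_off, -(w0 + g1 + w2), g2, 0.0, g0, 0.0],
        [0.0, w2, -(w0 + g2), 0.0, 0.0, g0],
        [w0, 0.0, 0.0, -(g0 + w1_on), g1, 0.0],
        [0.0, w0, 0.0, w1_on, -(g0 + g1 + w2), g2],
        [0.0, 0.0, w0, 0.0, w2, -(g0 + g2)],
    ])
    diag = [0.0, 0.0, 0.0, lam, 0.0, 0.0] if carried else [0.0, 0.0, 0.0, 0.0, lam, lam]
    B = np.diag(diag)
    ones = np.ones(6)

    # Solve A pi(1) = 0 with 1' pi(1) = 1 by overwriting one balance equation.
    M = A.copy()
    M[-1, :] = 1.0
    rhs = np.zeros(6)
    rhs[-1] = 1.0
    pi1 = np.linalg.solve(M, rhs)

    mean = ones @ B @ pi1                             # eq. (24)
    dpi = np.linalg.solve(np.eye(6) - A, B @ pi1)     # d/ds pi(1)
    second_factorial = ones @ B @ dpi                 # E X(X - 1)
    variance = second_factorial + mean - mean ** 2
    return mean, variance
```

Calling the function with `carried=True` uses the alternative B matrix 
for carried traffic; the two means should then add up to the offered 
mean, which gives a simple consistency check on any set of rates. 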
APPENDIX C 


Queue Analysis 


This appendix gives some details of the queue analysis. Assume a 
trunk group with $n$ trunks, $q_1$ queue slots on one side, $q_2$ on the 
other, and input streams into the queues given, conditionally on 
all-trunks-busy, by their IPP parameters $(\lambda_i, \omega_i, \gamma_i)$, 
$i = 1, 2$. The system is a continuous-time Markov chain. This appendix 
gives a description and a fixed ordering of the states of the system in 
Section C.1 and algorithms for the steady-state probabilities in 
Section C.2. 


C.1 Ordering of the states 


The elements of the different vectors and matrices to be used in this 
appendix are characterized by a four-dimensional index, 
$(i, j, \sigma_1, \sigma_2)$, where each possible index is a state of the 
system. Index $i$ is the number of calls waiting in the first queue, $j$ 
the number of calls waiting in the second queue, and $\sigma_1$ and 
$\sigma_2$ are status bits, indicating whether the IPP representing the 
input stream is currently in the OFF state ($\sigma = 0$) or in the ON 
state ($\sigma = 1$). Status bit $\sigma_1$ refers to the first queue, 
$\sigma_2$ to the second. Index $i$ runs between 0 and $q_1$, $j$ between 
0 and $q_2$. All bounds are inclusive. 

A fixed order of enumeration is adhered to. There is a macro 
ordering, for the (i, 7) part of the state designations, and within that 
ordering, the status bits have a fixed order. 

The gross order is given by 

$$\begin{array}{l}
(0, 0),\ (1, 0),\ (2, 0),\ \ldots,\ (q_1, 0), \\
(0, 1),\ (1, 1),\ \ldots,\ (q_1, 1), \\
(0, 2),\ \ldots \\
(0, q_2),\ (1, q_2),\ \ldots,\ (q_1, q_2).
\end{array}$$


Within these index elements there are four states indicating the 
status bits, which are ordered like 00, 01, 10, 11. Thus, the overall order 
is indicated by 0000, 0001, 0010, 0011, 1000, 1001, 1010, 1011, 2000, etc. 
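
The ordering just described can be made concrete with a short Python 
sketch. The helper names `enumerate_states` and `state_index` are 
hypothetical and serve only to illustrate the convention: the gross 
order runs over the first-queue count within each second-queue count, 
and the status bits run through 00, 01, 10, 11 within each pair. 

```python
from itertools import product

def enumerate_states(q1, q2):
    """List the states (i, j, s1, s2) in the fixed order of Section C.1."""
    states = []
    for j in range(q2 + 1):                 # second-queue count varies slowest
        for i in range(q1 + 1):             # then the first-queue count
            for s1, s2 in product((0, 1), repeat=2):   # status bits 00, 01, 10, 11
                states.append((i, j, s1, s2))
    return states

def state_index(i, j, s1, s2, q1):
    """0-based position of state (i, j, s1, s2) in that ordering."""
    return 4 * (j * (q1 + 1) + i) + 2 * s1 + s2
```

For example, `enumerate_states(2, 1)` lists 24 states beginning with 
(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 0, 1, 1), (1, 0, 0, 0). 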


C.2 Steady-State Equations 


The steady-state equations form a set of $4(q_1 + 1)(q_2 + 1)$ equations 
in as many unknowns, with the normalizing side condition that the 
probabilities add to 1. In the ordering given above, the structure of 
these equations is that of a tri-diagonal blocked matrix equation, as 
follows. Let $e_j$ be the vector of ergodic probabilities associated with 
the states having $j$ calls in $Q_2$, $j = 0, 1, \ldots, q_2$, i.e., 

$$e_j = (e_{0j00}, e_{0j01}, e_{0j10}, e_{0j11}, e_{1j00}, \ldots, e_{q_1 j 11}).$$

Then the $e_j$ satisfy 

$$e_0 A_e + e_1 C = 0, \tag{25a}$$
$$e_{j-1} B + e_j A_g + e_{j+1} C = 0, \quad j = 1, \ldots, q_2 - 1, \tag{25b}$$
$$e_{q_2-1} B + e_{q_2} A_f = 0. \tag{25c}$$

Writing out these equations in matrix form will show the tri-diagonal 
block structure. 

The matrices $A_e$, $A_f$, $A_g$, $B$, and $C$ are all square of order 
$4(q_1 + 1)$; in addition, $B$ and $C$ are both diagonal matrices. Below we 
discuss each of these matrices, and exhibit their structures. 

(i) $C$ is associated with a call arriving into $Q_2$: the system moves 
from state $(i, j, \sigma_1, \sigma_2)$ to state $(i, j + 1, \sigma_1, \sigma_2)$, 
with transition rate $\lambda_2$ if and only if $\sigma_2 = 1$. Therefore, 
$C = \mathrm{diag}(0, \lambda_2, 0, \lambda_2, \ldots, 0, \lambda_2)$. 

(ii) $B$ signifies the transitions caused by a call waiting in $Q_2$ being 
served. The transition is from $(i, j, \sigma_1, \sigma_2)$ to 
$(i, j - 1, \sigma_1, \sigma_2)$, with rate $n$ if $i = 0$, and with rate 
$n_2$ (where $n_2$ is the effective number of circuits available to calls 
from $Q_2$ when $Q_1$ and $Q_2$ are both non-empty) if $i \ne 0$. 
Specifically, $B = \mathrm{diag}(n, n, n, n, n_2, n_2, \ldots, n_2)$. 

(iii) $A_e$, $A_f$, and $A_g$ are all associated with transitions for which 
the number of calls in $Q_2$ is constant, this constant being equal to 0 
for $A_e$ (empty), $q_2$ for $A_f$ (full), and general for $A_g$. These 
matrices have a tri-diagonal block structure themselves, with blocks of 
size $4 \times 4$, with the super- and sub-diagonal blocks associated with 
calls entering $Q_1$ and leaving $Q_1$, respectively, and with the diagonal 
blocks modeling the transitions between the input status states 
$(\sigma_1, \sigma_2) = (0, 0), (0, 1), (1, 0), (1, 1)$. $A_e$, $A_f$, and 
$A_g$ have slight differences owing to the fact that 

(a) The diagonal element for any state contains the negative sum of 
the transition rates out of that state. 

(b) The rate at which calls leave $Q_1$ depends on whether $Q_2$ is 
empty. 

Next we show that we may take advantage of the block structure of 
the $4(q_1 + 1)(q_2 + 1)$ set of steady-state equations, and reduce it to a 
set of size $4(q_1 + 1)$ to be solved via brute-force matrix inversion 
methods. Consider equations (25). Since $B$ is a nonsingular diagonal 
matrix, we may write 

$$e_{q_2-1} = -e_{q_2} A_f B^{-1},$$
$$e_{q_2-2} = -e_{q_2}\,(C B^{-1} - A_f B^{-1} A_g B^{-1}).$$

In general, if $e_j = e_{q_2} D_j$ for $j = k + 1, k + 2, \ldots, q_2$, one 
obtains $D_k$ from equation (25b): 

$$D_k = -D_{k+1} A_g B^{-1} - D_{k+2} C B^{-1} \quad (k = 0, 1, \ldots, q_2 - 1),$$

with $D_{q_2} = I$, the unit matrix, and $D_{q_2+1} = 0$. For $k = q_2 - 1$, 
equation (25c) applies in place of (25b), so $A_f$ takes the place of 
$A_g$ in that first step. 

Assume then that we have expressed $e_0$ in terms of $e_{q_2}$. Now we use 
the top equation to solve for $e_{q_2}$, and get 

$$e_{q_2}(D_0 A_e + D_1 C) = 0. \tag{26}$$

The requirement that all probabilities sum to 1 can likewise be 
expressed in terms of $e_{q_2}$ only: 

$$\sum_{j=0}^{q_2} e_j \mathbf{1} = \sum_{j=0}^{q_2} e_{q_2} D_j \mathbf{1} = 1.$$

Thus, we sum the row sums of the $D_j$, obtaining 

$$x = \sum_{j=0}^{q_2} D_j \mathbf{1}.$$

We now replace one column of $D_0 A_e + D_1 C$ by the vector $x$, and the 
corresponding entry in the 0 vector in (26) by 1, and solve this modified 
set of equations. We obtain $e_{q_2}$, and from it may find all ergodic 
probabilities by postmultiplying $e_{q_2}$ by the $D_j$'s. 

If the number of queue slots for $Q_1$ is larger than for $Q_2$ 
($q_1 > q_2$), we may choose to relabel $Q_1$ and $Q_2$, and reduce the 
problem size by performing the recursive procedure described above over 
the larger of $q_1$ and $q_2$. Thus, we may reduce the size of the matrix 
to be inverted from $4(q_1 + 1)(q_2 + 1)$ to $4[1 + \min(q_1, q_2)]$. 
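
The backward recursion of Section C.2 can be sketched in a few lines of 
Python. This is an illustration only, not the program used in the 
study: the function name `solve_blocked_queue` is hypothetical, NumPy 
is assumed to be available, and the blocks for the empty, general, and 
full cases, together with B and C, are taken as given arrays supplied 
by the caller. The sketch follows the algebra of eqs. (25) and (26) 
literally, using the full-queue block in the first step of the 
recursion. 

```python
import numpy as np

def solve_blocked_queue(A_e, A_g, A_f, B, C, q2):
    """Solve eqs. (25a)-(25c) for e_0, ..., e_{q2} by backward recursion.

    All blocks are m-by-m arrays; B must be a nonsingular diagonal
    matrix. Requires q2 >= 1. Returns the list [e_0, ..., e_{q2}].
    """
    m = B.shape[0]
    B_inv = np.diag(1.0 / np.diag(B))
    D = [None] * (q2 + 2)
    D[q2 + 1] = np.zeros((m, m))
    D[q2] = np.eye(m)
    for k in range(q2 - 1, -1, -1):
        A_k = A_f if k == q2 - 1 else A_g      # eq. (25c) gives the first step
        D[k] = -(D[k + 1] @ A_k + D[k + 2] @ C) @ B_inv

    M = D[0] @ A_e + D[1] @ C                  # eq. (26): e_{q2} M = 0
    x = sum(D[j] for j in range(q2 + 1)) @ np.ones(m)   # summed row sums of the D_j
    M[:, 0] = x                                # normalization replaces one column
    rhs = np.zeros(m)
    rhs[0] = 1.0
    e_q = np.linalg.solve(M.T, rhs)            # row-vector system e_{q2} M = rhs
    return [e_q @ D[j] for j in range(q2 + 1)]
```

Only one linear system of the block order is solved; every other 
operation is a matrix product or the inversion of the diagonal matrix B. 
For long queues the repeated products may lose numerical accuracy, so a 
comparison with a direct solution on a small case is a sensible check. 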
A Description of the Bell Laboratories Scanned 
Acoustic Microscope 


By P. SULEWSKI,* D. J. BISHOP, and R. C. DYNES 
(Manuscript received February 4, 1982) 


We have developed a working scanning reflection acoustic micro- 
scope. In this paper we describe its construction and operation and 
also present preliminary acoustic micrographs and compare them 
with equivalent optical and electron micrographs. Our instrument 
operates at room temperature using 2-GHz acoustic radiation with 
water as a coupling medium, and has a resolution of approximately 
1 μm. We also discuss improvements to be made in future instruments 
with liquid helium as a coupling medium. 


I. INTRODUCTION 


Scanning reflection acoustic microscopy uses the amplitude of re- 
flected high-frequency sound waves as a contrast mechanism to gen- 
erate micrographs with submicrometer resolution. This method was 
first reported in 1974 by R. A. Lemons and C. F. Quate.’ Since that 
time, much progress has been made in the field, including a steady 
improvement in the resolution of the instrument.” To date, acoustic 
microscopy has been used to study integrated circuits, biological 
specimens, and various materials with much success.” 

Although acoustic micrographs appear quite similar to their optical 
counterparts, the source of the acoustic contrast lies in the mechanical 
properties of the sample. Hence, acoustic micrographs provide infor- 
mation that is fundamentally different from that of optical micro- 
graphs. Since the acoustic reflectivity at a surface is a strong function 
of the layering structure beneath,’? much effort has been directed 
toward nondestructive analysis and characterization of integrated cir- 
cuit defects using acoustic microscopy. In our research, it is hoped that 
variations in stress and crystal orientation near dislocations in crystals 


* Princeton University. 
Fig. 1—The lens configuration of the scanning acoustic microscope. 


will provide sufficient acoustic contrast to image dislocations, providing 
a powerful tool for nondestructive analysis. 

Here we present micrographs from some preliminary surface studies, 
indicating the performance of our acoustic microscope, along with an 
explanation of its basic operation and components. 


II. PRINCIPLES OF OPERATION 


Our micrographs are generated using the acoustic lens pictured in 
Fig. 1. While using such a simple lens optically would result in severe 
distortion owing to spherical aberration, such effects are negligible in 
the acoustic case, since the distortion is proportional to the square of 
the ratio of the velocity of sound in the two media, sapphire and water. 
Since $V_{Al_2O_3}$ = 11.1 km/s and $V_{H_2O}$ = 1.5 km/s, such aberration effects 
can be ignored.’ A coupling medium, in our case water, is used since 
air cannot transmit 2-GHz acoustic waves with acceptable losses. 

The lens focuses the acoustic waves to a point, whose actual width 
is diffraction limited to the order of the wavelength $\lambda$. With the sample at this focal 
point, the acoustic waves are reflected back through the lens. The 
amplitude of the reflected waves provides the contrast for each reso- 
lution element in the final picture. Light and dark areas arise from 
variations in the acoustic reflectivity of the sample. Since the focal 
9 — 3-GHz LOW-PASS FILTER, MICROLAB LA-30N 
10 — CIRCULATOR, ADDINGTON LABORATORY 100100007 
11 — ACOUSTIC LENS ELEMENT, STANFORD UNIVERSITY (1.7-2.1 GHz) 
12 — SWITCH, HP 8731B (0.8-2.4 GHz) 
13 — SLAVE PULSER, HP 8403A 
14 — 2.2-GHz LOW-PASS FILTER, HP 360C 
15 — 1-GHz HIGH-PASS FILTER 
16 — AMPLIFIER, TRON-TECH W2GE 
17 — AMPLIFIER, TRON-TECH W2GC 
18 — STUB TUNER 
19 — DETECTOR, HP 440A 
20 — BOXCAR INTEGRATOR, PRINCETON APPLIED RESEARCH, PAR 160 
21 — SCAN CONVERTER, PRINCETON ELECTRONIC PRODUCTS, PEP 500 
22 — VIDEO SCREEN, CONRAC 
23 — OSCILLOSCOPE, TEKTRONIX 7834 
24 — LOCK-IN AMPLIFIER, PAR 124A, WITH DIFFERENTIAL PREAMP, PAR 116 
25 — PHASE-LOCKED SIGNAL GENERATOR. 
     (a) TRIGGER/PHASE LOCK, HP 3302A. 
     (b) FUNCTION GENERATOR, HP 3300A. 
26 — PHASE SHIFTER, ZERO CROSS DETECTOR USED TO BLANK ON RETRACE 
27 — LOW-NOISE PREAMPLIFIER, ITHACO 1201 

Fig. 2—Block diagram of the electronics for the microscope. 
Fig. 3—The mechanical stage for the scanning acoustic microscope. 


point has finite width, the resolution of the resulting micrograph is 
limited to ~0.75$\lambda$, as calculated by Wickramasinghe.* 
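
The numbers quoted in this section can be checked with a few lines of 
Python. This is a hedged back-of-the-envelope sketch: the velocities, 
the 2-GHz operating frequency, and the factor 0.75 are taken from the 
text, while the function names are hypothetical and the 
velocity-ratio figure is only the scaling quantity mentioned above, not 
a full aberration calculation. 

```python
V_WATER = 1.5e3       # m/s, sound velocity in water (from the text)
V_SAPPHIRE = 11.1e3   # m/s, sound velocity in sapphire (from the text)

def acoustic_wavelength(freq_hz, velocity=V_WATER):
    """Acoustic wavelength in meters in the coupling medium."""
    return velocity / freq_hz

def resolution_estimate(freq_hz, factor=0.75):
    """Diffraction-limited resolution, taken as factor times the wavelength."""
    return factor * acoustic_wavelength(freq_hz)

wavelength_um = acoustic_wavelength(2e9) * 1e6    # 0.75 micrometers in water
resolution_um = resolution_estimate(2e9) * 1e6    # about 0.56 micrometers
aberration_scale = (V_WATER / V_SAPPHIRE) ** 2    # about 0.018
```

The estimate of roughly 0.56 μm is the diffraction limit in water at 
2 GHz; the resolution actually reported for the instrument is about 1 μm. 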

To create a picture, the sample is scanned in a raster pattern, line 
by line. This is accomplished by vibrating the sample at a frequency 
of 37 Hz and an amplitude of ±250 μm in the x direction, while driving 
the sample through 500 μm in the y direction in 10 to 15 seconds, 
completely scanning the sample. The amplitude of the reflected acous- 
tic wave is then used together with x and y positioning information 
given by linear variable differential transformers (LVDT) to create a 
video image by a scan converter. 
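
The scan figures above imply a line count and line spacing that can be 
estimated as follows. This sketch rests on assumptions that are not 
stated in the text: one displayed line per cycle of the x vibration 
(the other half cycle being blanked as retrace), a uniform y drive, 
and the 500-kHz pulse rate quoted below for the master pulser. The 
function name `raster_summary` is hypothetical. 

```python
def raster_summary(frame_time_s, x_freq_hz=37.0, y_travel_um=500.0,
                   x_span_um=500.0, pulse_rate_hz=500e3):
    """Rough raster-scan figures under the assumptions stated above.

    Returns (lines per frame, line pitch in um, pulses per displayed
    line, mean pulse spacing along a line in um).
    """
    lines = x_freq_hz * frame_time_s                     # one line per x cycle (assumed)
    line_pitch_um = y_travel_um / lines
    pulses_per_line = pulse_rate_hz / x_freq_hz / 2.0    # half of each cycle is retrace
    mean_pulse_spacing_um = x_span_um / pulses_per_line  # average; the motion is sinusoidal
    return lines, line_pitch_um, pulses_per_line, mean_pulse_spacing_um

fast = raster_summary(10.0)   # 370 lines, about 1.35 um between lines
slow = raster_summary(15.0)   # 555 lines, about 0.90 um between lines
```

Under these assumptions the line pitch is comparable to the acoustic 
resolution, while the pulses along each line are spaced far more 
finely than one resolution element. 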

The acoustic waves are generated by a piezoelectric transducer on 
the back of the lens element, which converts microwaves at a frequency 
near 2 GHz to acoustic waves of the same frequency. 

A master pulser, (1) in Fig. 2, governs the timing of the electronics 
that control and process the microwave power. The master pulser 
generates trigger pulses at a rate of 500 kHz. The microwaves are 
generated (2) and then amplified (3) to a power level of 1 watt. Three 
PIN-diode switches (4, 5) connected in series chop the continuous 
waves into 18-ns pulses at a rate set by the master pulser. An inter- 
mediate slave pulser (6) drives the switches. The use of pulsed waves 
allows us to isolate the desired information pulse from spurious reflec- 


Fig. 4—Acoustic micrograph of gold mask on epitaxially grown LED (layers of InP, 
InGaAsP, InP). (a) Before cleaning. (b) After swabbing with cotton swab soaked in 
ethanol. (c) After swabbing with alcohol. (d) With lens off focus slightly (or at different 
focus). 





Fig. 5—Gold stripes on silicon. 


tions. Three switches are used in series to increase the on-off ratio. 
Since the power of the desired reflected signal pulse is generally 80 dB 
below that of the initial pulse, any leakage of microwaves during the 
off state would produce undesirable interference patterns. 

The microwave pulse then passes through an isolator (7) which 
protects the system from spurious multiple reflections. A 1-GHz high- 
pass filter (8) and a 3-GHz low-pass filter (9) eliminate low-frequency 
and high-frequency noise, respectively, from the switches. A circulator 
(10) directs the pulse into the lens element (11), and then directs the 
reflected signal pulse into the rest of the circuit. 

At the back of the lens, the pulse is electrically matched to the 
piezoelectric transducer as in Fig. 1. Even with this matching network, 
there is a large reflection at this point. The transducer, composed of a 
gold-ZnO-gold sandwich, evaporated onto the back of the sapphire, 
converts the microwave pulse into an acoustic pulse of the same 
frequency, which then travels through the sapphire to the lens. At a 
simple lens-water interface, most of the incident radiation would be 
reflected since there is a large impedance mismatch, as $Z_{H_2O}$ = 
1.5 × 10^5 g/cm²·s and $Z_{Al_2O_3}$ = 44.0 × 10^5 g/cm²·s. A 
one-quarter-wavelength 
impedance-matching layer of borosilicate glass at this interface 
Fig. 6—A different region of gold stripes on silicon pattern shown in Fig. 5. 


significantly increases transmission although it does not completely 
eliminate this reflection.” The transmitted and focused acoustic pulse 
then reflects off of the sample and back through the lens to the 
transducer, where it is reconverted into microwaves. Losses in the 
water amount to ~70 dB. The amplitude of the reflected pulse depends 
upon the acoustic properties of the sample; hence, the contrast in the 
resulting micrograph reflects variations in acoustic properties of the 
sample. The pulse passes through an isolator (7) and a final PIN-diode 
switch (12), which acts as a window to protect the amplifiers from the 
large spurious reflections already mentioned. Two filters (14, 15) 
reduce the noise, especially low frequencies from the switch. The 
amplifiers (16, 17) give a total of 62-dB amplification. A stub tuner 
(18) matches the incoming transmission line to the crystal-diode de- 
tector (19). The detector rectifies the pulse of microwaves, and a 
boxcar integrator (20) is used to improve the signal-to-noise ratio. The 
scan converter (21) uses the x and y positioning information together 
with the signal pulse amplitude to construct a video picture, which 
may then be photographed from the screen (22). 
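
The size of the impedance mismatch at an unmatched sapphire-water 
interface can be illustrated with the standard normal-incidence 
formulas. The sketch below is illustrative only: the two impedance 
values are those quoted above, the function names are hypothetical, and 
the geometric-mean rule is the textbook ideal for a quarter-wave 
matching layer rather than a property of the glass layer actually used. 

```python
import math

Z_WATER = 1.5e5       # g/(cm^2 s), from the text
Z_SAPPHIRE = 44.0e5   # g/(cm^2 s), from the text

def power_reflection(z1, z2):
    """Fraction of incident power reflected at normal incidence."""
    return ((z2 - z1) / (z2 + z1)) ** 2

def ideal_quarter_wave_impedance(z1, z2):
    """Impedance of an ideal quarter-wave matching layer (geometric mean)."""
    return math.sqrt(z1 * z2)

reflected = power_reflection(Z_SAPPHIRE, Z_WATER)              # about 0.87
one_way_loss_db = -10.0 * math.log10(1.0 - reflected)          # about 8.9 dB
z_match = ideal_quarter_wave_impedance(Z_SAPPHIRE, Z_WATER)    # about 8.1e5
```

Without a matching layer, roughly 87 percent of the incident power 
would be reflected at each crossing of the interface under these 
idealized conditions. 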

The scan electronics provide the means to scan the sample in a 
Fig. 7—Gold stripes with scratches on silicon. 


raster pattern. The resolution of the LVDTs used must be submicrometer 
if the resolution in the microscope itself is to be submicrometer. 

The sample stage (shown in Fig. 3) is mounted on leaf springs so 
that vibration of the sample is possible only in the x direction. The 
frequency of this vibration is held at resonance, approximately 37 Hz, 
using a phase-locked signal generator (25). Translation in the y direc- 
tion is accomplished by an optical translation stage driven by a dc 
motor. Most of the weight of the stage is supported by a spring to 
relieve the motor, so that it can be operated without stalling. 

The x position of the sample is given by an ac LVDT connected to a 
lock-in amplifier (24). The y position is given by a dc LVDT whose 
output voltage is proportional to the displacement. It was found that 
the 10-kHz modulation frequency used to operate the LVDTs leaked 
into both x and y position signals, causing image degradation and poor 
resolution. Filtering these signals resulted in much improved images. 

Since the theoretical resolution is proportional to $\lambda$, increasing the 
frequency should improve the resolution. However, for water and 
many other liquids, the acoustic attenuation is proportional to $f^2$, so 
Fig. 8—Gold cross on silicon. 


increasing the frequency dramatically increases the power losses within 
the coupling medium.” To preserve the signal-to-noise ratio, 1.8 to 2.0 
GHz is the present optimal frequency range. Other coupling media 
continue to be investigated. 
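
The trade-off between resolution and attenuation can be illustrated 
numerically. The sketch below assumes that the attenuation in decibels 
grows as the square of the frequency, the usual behavior for water, 
and, as a further simplifying assumption, that the whole loss of about 
70 dB quoted earlier for the water path scales in this way at a fixed 
path length. The function name and the 2-GHz reference frequency are 
hypothetical choices. 

```python
def water_loss_db(freq_ghz, ref_loss_db=70.0, ref_freq_ghz=2.0):
    """Loss in the water path, scaled as frequency squared from a reference point."""
    return ref_loss_db * (freq_ghz / ref_freq_ghz) ** 2

loss_at_2_ghz = water_loss_db(2.0)   # 70 dB, the reference value
loss_at_3_ghz = water_loss_db(3.0)   # 157.5 dB
loss_at_4_ghz = water_loss_db(4.0)   # 280 dB
```

Doubling the frequency would halve the wavelength but, under this 
assumption, quadruple the loss in decibels unless the path in the 
liquid is shortened. 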

We use water heated to 60°C, where the acoustic attenuation is 
lower than at room temperature, resulting in an improved signal-to- 
noise ratio. 

We have found that both samples and lens must be kept very clean 
if the microscope is to function at all. A very thin film of oil, like that 
deposited by a fingerprint, can reduce the amplitude of the reflected 
pulse dramatically. Also any small dust particle, with a diameter of 
several micrometers, which becomes lodged in the lens will distort and 
even eliminate the reflected signal pulse. 


III. IMAGES 


For our initial testing of the microscope, various surface structures 
were examined, and comparisons performed with optical and electron 
micrographs. 
Fig. 9—Bulk-epitaxial transition InGaAsP; left, bulk; right, epitaxial. 


The first series of micrographs in Fig. 4 of a gold mask on InGaAsP 
indicates the importance of surface cleaning. Any oil or grease, for 
example a fingerprint or any of the vacuum grease used to fix the 
sample to the stage, shows up quite clearly and obscures the true detail 
(Fig. 4a). Cleaning the surface with a cotton swab soaked in ethanol 
significantly reduces the surface structure of the first micrograph (Fig. 
4b,c). Fig. 4d presents another method of discerning true structure 
from mere surface dirt effects. By maintaining the lens slightly out of 
focus, the effects of surface dirt disappear. Also note the phase shifts 
at the edges of the gold to InGaAsP transition owing to different 
acoustic path lengths in the materials. The diagonal lines visible in 
Figs. 5, 6, and 7 are caused by noise over the signals given by the LVDTs. 
This effect was eliminated from subsequent micrographs. The noise 
produced an uncertainty of several micrometers in the sample position, 
seriously degrading image resolution. 

Fig. 8 displays the dramatic improvement in image quality achieved 
by eliminating the noise over the LVDT signals. The horizontal dark 
and light bands result when the dc motor alternately slows down or 
speeds up within the single frame. Near the bottom of the picture, 
Fig. 11—Indium inclusion on epitaxial layer. (a) Optical. (b) Electron. 
(c) Acoustic. 
Fig. 13—(a) A sample of InGaAsP is shown with a gold evaporation on the surface. 
The dark regions are indium inclusions which have formed on the surface. The field of 
view is ~500 μm × 500 μm. (b) A close-up of an inclusion from Fig. 13a at ten times the 
magnification. 
Fig. 14—A more recent micrograph of gold on silicon showing the quality of the 
acoustic micrographs we have been able to obtain. 


single scan lines are visible. This problem has been overcome with a 
better mechanical drive. 

One area of application of the acoustic microscope is the evaluation 
of subsurface defects within epitaxial layers. Preliminary to such 
studies, we have examined various samples of epitaxially grown wafers 
with inclusions visible on the surface. The first picture in the series, 
Fig. 9, shows a transition from bulk to epitaxial InGaAsP, from left to 
right. 

In Fig. 10, we compare optical, electron, and acoustic micrographs 
of an indium inclusion in an epitaxial layer of InGaAsP. In the acoustic 
picture, in the lower right opening in the inclusion, several round spots 
a few micrometers across appear, which do not appear in either the 
electron or optical micrographs. These could be subsurface features 
that only the acoustic microscope can image. 

The next series of pictures in Fig. 11 show another indium inclusion 
on the same sample. Again while more surface features are visible with 
optical and electron microscopy, there is a round spot 5 μm in diameter 
near the lower edge of the inclusion in the acoustic micrograph that 
does not appear in either of the other pictures. 
Fig. 15—This micrograph shows an image where the derivative of the acoustic 
intensity is plotted. This type of photo provides a sense of depth perception to the 
photos and generates a microscopic contour map. 


In Fig. 12, we present a comparison between electron and acoustic 
micrographs of a 50-μm gold strip evaporated onto silicon. The scratch 
marks provide a good measure of the resolution of the instrument. The 
top two horizontal scratches are 5 μm apart, and easily discerned in 
the acoustic micrograph. The bottom three scratches are each 2.5 μm 
apart from the other and also easily discernible. However, the two 
diagonal scratches, separated by 1 μm, are not distinguishable in this 
micrograph. 

In Fig. 13, we show a recent micrograph of an InGaAsP sample with 
a gold overlayer. The instrument has been optimized for this series of 
pictures and gives some indications of the quality that it is capable of. 
The small dark regions are indium inclusions which have formed on 
top of the sample. Figure 13b is a micrograph of a single inclusion 
seen in the center of the upper photo. 

In Fig. 14, we show a picture of gold stripes evaporated on a silicon 
substrate. Finally in Fig. 15, we show the results of a form of derivative 
microscopy. In that picture, we have displayed the derivative of the 
acoustic amplitude. This produces an image which is a micro-contour 
map of the substrate. 


IV. CONCLUSIONS 


We have successfully operated a scanning acoustic microscope at 
60°C with resolution approaching the diffraction limit using water as 
a transmitting medium. Our instrument operates at 1.9 GHz with a 
resolution of ~1 μm. We have shown some preliminary acoustic micro- 
graphs indicating how our present instrument might be used in studies 
of structural defects in various materials of technological importance. 

Work is currently under way to develop an instrument using liquid 
He’ as an acoustic transmitting medium, which will allow us to use 
much higher acoustic frequencies. Such a device will be the first 
acoustic microscope that will not have diffraction-limited resolution. 
A Statistical Model of Multipath Fading on a 
Space-Diversity Radio Channel 


By W. D. RUMMLER 
(Manuscript received February 22, 1982) 


The joint probability of occurrence of frequency-selective fades on 
a pair of spatially separated receiving antennas is modeled for a 
typical line-of-sight microwave radio path in the 6-GHz band. The 
model was developed from observations of the transmission in a 24.2- 
MHz band during all multipath-fading occurrences in a 30-day 
period on a 26.4-mile path. By fitting the observations of every scan 
on both antennas with a simplified three-ray channel modeling 
function, the joint transmission at each observation is characterized 
by six parameters, three for each antenna. The joint occurrence of 
these six parameters is described by simple statistical distribution 
functions, allowing one to associate with any pair of channel trans- 
mission shapes the fraction of a year, or number of seconds in a year, 
that such a channel state will be encountered. The model represents 
the frequency selectivity or shape of the fades on the two antennas as 
statistically independent. Only the average fade levels on the two 
antennas are statistically related. Either antenna is more likely to 
experience a fade deeper than the median when selectivity is observed 
on it or when the other antenna is experiencing deeper fading than 
the median. The (marginal) statistics of fading on each of the anten- 
nas separately, as derived from the diversity model, are essentially 
the same as those described by a nondiversity statistical fading 
model, which has been used successfully to predict the multipath 
outage of digital radio systems. The model developed here will allow 
performance to be estimated in a diversity configuration. 


I. INTRODUCTION 


Occurrences of multipath fading limit the performance quality of 
high-speed digital radio systems operating on line-of-sight microwave 
radio paths. Extensive field measurement programs have been implemented 
to evaluate the performance of a number of digital radio 
systems operating in different configurations under various condi- 
tions.’ These studies have indicated the universal need for some form 
of dynamic channel equalization, and for space diversity reception, on 
many paths, to meet performance requirements. While field measure- 
ments provide a good means of evaluating the operation of radio 
systems, they require considerable time, effort, and expense. Further- 
more, they suffer from the vagaries of nature in that multipath fading 
is a randomly occurring phenomenon with variable characteristics 
from month to month.*® 

To reduce the need for field measurements, a statistical model of 
multipath fading was developed.*’ This model, used in conjunction 
with characterization measurements performed on a radio in the 
laboratory, allows predictions to be made of the system performance 
under multipath conditions when operating in a nondiversity configu- 
ration.* ° The results presented here extend the previous work by 
providing a statistical model for multipath fading in a space-diversity 
configuration. 

The data base used for modeling was obtained by transmitting a 
wideband (8-PSK digital radio) signal at 6 GHz over a 26.4-mile path 
from Atlanta to Palmetto, Georgia. The received power at a number 
of frequencies in a 24.2-MHz band was measured simultaneously on 
both a horn reflector and a parabolic dish antenna separated by 30 
feet. Spectra were observed at rates up to five times per second during 
the occurrences of multipath propagation in a 30-day period in August 
to September, 1977. The received voltages on both the horn and dish 
at each observation, relative to unfaded or free-space propagation 
conditions, are represented as a function of frequency by the simplified 
three-path modeling function that has the form 


$$H(j2\pi f) = a\left[1 - b\,e^{-j2\pi(f - f_0)\tau}\right]. \tag{1}$$


The diversity channel model provides a joint statistical representation 
of the occurrence of the parameters of the function (1) as fitted for 
both the horn and dish. 

The choice of a modeling function for representing selective multi- 
path fades over a restricted frequency band is not unique. Such a 
function needs only to be capable of representing the characteristics of 
a multipath fade. The parameter statistics will depend strongly on the 
choice of function. Greenstein and Czekaj'' have used a complex 
polynomial in frequency to represent multipath fades and have devel- 
oped statistics for the coefficients of the polynomial for a nondiversity 
data base. Although other modeling functions have been proposed, ’?"’ 
none has been successfully represented on a statistical basis. The 
modeling function (1) used here has the virtues of providing an 
excellent representation of the observed multipath fades and of being 
convenient for synthesis in the laboratory for the stressing of radio 
systems for performance appraisal. In the present work we show that 
a further advantage of this function is that the joint statistics in a 
diversity configuration are well behaved and easily represented. 

The statistical channel model is summarized in Section II. The data 
base used and the fitting of observations with the modeling function 
are described in Section III. In Section IV we provide the methodology 
for developing and verifying the statistical model. Concluding remarks 
are provided in Section V. 


II. MODEL SUMMARY 
2.1 Modeling function 


During multipath fading, the voltage transfer functions of the paths 
to the horn reflector and to the parabolic dish antenna are modeled by 


H_H(j2\pi f) = a_H\left[1 - b_H\,e^{-j2\pi (f - f_{0H})\tau}\right] \qquad (2)
and
H_D(j2\pi f) = a_D\left[1 - b_D\,e^{-j2\pi (f - f_{0D})\tau}\right], \qquad (3)


respectively. These transfer functions are measured relative to the 
unfaded, or free-space, transfer functions, which are both taken as 
unity at all frequencies, f. For convenience and for consistency with 
previous work we fix the delay τ at 6.3 ns. These functions may be 
interpreted as the responses of channels with direct transmission paths 
with amplitudes a_H and a_D, and second paths with relative amplitudes 
b_H and b_D, respectively. The second path in each case has a 
relative delay of 6.3 ns and a phase of 2πf_0Hτ + π or 2πf_0Dτ + π 
(independently controllable) at the center frequency of the channel. 

A typical plot of the attenuation produced by such modeling func- 
tions is shown in Fig. 1. The a and b parameters control the depth and 
shape of the simulated fades, respectively. The parameters f_0H and f_0D 
determine the frequencies of the transmission minima, or notches, of 
the simulated fades. With a simulated minimum within the channel, 
the modeling functions can simulate a wide range of levels and notch 
widths. With a simulated minimum out of the channel band, the 
modeling functions can generate a wide range of combinations of 
levels, slopes, and curvatures of the in-band responses. 
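
As a rough illustration of how these modeling functions shape the in-band response, the following sketch evaluates the attenuation of a single channel at the twelve measurement frequencies. The function name and the parameter values (a = 0.1, b = 0.7, notch 4.4 MHz below band center) are illustrative assumptions, not values taken from the model.

```python
import cmath
import math

TAU_NS = 6.3  # fixed delay from the text, in nanoseconds


def attenuation_db(f_mhz, a, b, f0_mhz, tau_ns=TAU_NS):
    """Attenuation (dB) of H = a[1 - b exp(-j 2 pi (f - f0) tau)].

    Frequencies are in MHz, measured from the channel center frequency.
    """
    theta = 2.0 * math.pi * (f_mhz - f0_mhz) * 1e6 * tau_ns * 1e-9
    h = a * (1.0 - b * cmath.exp(-1j * theta))
    return -20.0 * math.log10(abs(h))


# Twelve frequencies spaced 2.2 MHz apart, spanning 24.2 MHz about band center.
freqs = [-12.1 + 2.2 * k for k in range(12)]

# Illustrative (hypothetical) parameter values with an in-band notch.
profile = [attenuation_db(f, 0.1, 0.7, -4.4) for f in freqs]
```

At the notch frequency the attenuation is the sum of the fade level, -20 log a, and the relative notch depth, -20 log(1 - b); everywhere else in the band it is smaller.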

For convenience we work with the following related parameters: the 
fade-level parameters (in decibels) 


A_H = -20 \log a_H \qquad (4)
and
A_D = -20 \log a_D; \qquad (5)
the relative notch depth parameters (in decibels),
B_H = -20 \log(1 - b_H) \qquad (6)
and
B_D = -20 \log(1 - b_D); \qquad (7)


and the notch frequency parameters, which we measure in degrees, 


\phi_H = 360 f_{0H} \tau \qquad (8)
and
\phi_D = 360 f_{0D} \tau. \qquad (9)


Fig. 1—Attenuation of channel modeling function. [Abscissa: f − f_0 in megahertz.] 


With τ equal to 6.3 ns, one degree in φ corresponds to 0.44 MHz. We 
measure notch frequencies, φ, from the center frequency of the channel, 
so that φ covers the range from −180 to +180 degrees, corresponding 
to the 158.4-MHz period of the functions (2) and (3). 
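
The parameter conversions of (4) to (9) can be sketched as follows; the function names are hypothetical and the logarithms are taken to base 10, as is implied by the use of decibels.

```python
import math

TAU_NS = 6.3  # fixed delay from the text, in nanoseconds


def fade_level_db(a):
    """A = -20 log a, as in (4) and (5)."""
    return -20.0 * math.log10(a)


def notch_depth_db(b):
    """B = -20 log(1 - b), as in (6) and (7)."""
    return -20.0 * math.log10(1.0 - b)


def notch_phase_deg(f0_mhz, tau_ns=TAU_NS):
    """phi = 360 f0 tau, as in (8) and (9); f0 is measured from band center."""
    return 360.0 * (f0_mhz * 1e6) * (tau_ns * 1e-9)


def mhz_per_degree(tau_ns=TAU_NS):
    """Width in MHz of one degree of notch frequency."""
    return 1.0 / (360.0 * tau_ns * 1e-9) / 1e6
```

With the delay fixed at 6.3 ns, `mhz_per_degree()` evaluates to roughly 0.44 MHz, which is the conversion quoted in the text.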


2.2 Parameter statistics 


The number of seconds in a year that the six parameters (A_H, A_D, 
B_H, B_D, φ_H, φ_D) are in a differential element of the six-dimensional 
parameter space is shown to be given by 

T(A_H, A_D, B_H, B_D, \phi_H, \phi_D) = T_0\, p_{A/B}(A_H, A_D \mid B_H, B_D)\, p_{B_H}(B_H)\, p_{B_D}(B_D)\, p_H(\phi_H)\, p_D(\phi_D), \qquad (10)
where the functions p(-) are all probability density functions. The 
time scale factor T_0, under the assumption that events scale with the 


classical scaling of the incidence of multipath fading,” is given by the 
expression: 


T_0 = 52800\, c\, (f/6)\,(D/25)^3, \qquad (11)
where 


f is the frequency in GHz, 
D is the path length in miles, and 
c is the terrain factor, varying between 0.25 and 4. 
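
A minimal sketch of the time scale factor follows. It assumes a cubic dependence on path length, as in the classical scaling of multipath occurrence; the function name is hypothetical, and no terrain factor is given in the text for any particular path.

```python
def time_scale_seconds(freq_ghz, path_miles, terrain_c):
    """Time scale factor T_0 in seconds per year.

    Assumes T_0 = 52800 c (f/6) (D/25)^3, i.e., a cubic dependence on
    path length, which is an assumption of this sketch.
    """
    return 52800.0 * terrain_c * (freq_ghz / 6.0) * (path_miles / 25.0) ** 3


# Reference case: a 25-mile path at 6 GHz with unit terrain factor.
reference = time_scale_seconds(6.0, 25.0, 1.0)
```

For the reference case the scaling factors are all unity, so T_0 reduces to the leading constant of 52,800 seconds.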


The probability density functions of horn and dish relative notch 
depths are given, respectively, by 


p_{B_H}(B_H) = 0.76711\,(2B_H)(0.10258)\,e^{-0.10258 B_H^2} + 0.23289\,(0.23281)\,e^{-0.23281 B_H} \qquad (12)


p_{B_D}(B_D) = 0.82295\,(2B_D)(0.07668)\,e^{-0.07668 B_D^2} + 0.17705\,(0.21786)\,e^{-0.21786 B_D}. \qquad (13)
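
Each of these densities is a weighted sum of a Rayleigh-type term and an exponential term, so its complementary cumulative distribution has a simple closed form. The sketch below evaluates both; the function names are hypothetical, and the rate in each exponential term is assumed to equal the constant that multiplies it, as a properly normalized density requires.

```python
import math

# (weight of the Rayleigh-type term, its rate, rate of the exponential term)
HORN = (0.76711, 0.10258, 0.23281)
DISH = (0.82295, 0.07668, 0.21786)


def notch_pdf(b_db, params):
    """Probability density of relative notch depth B (in dB)."""
    w, k, lam = params
    return (w * 2.0 * b_db * k * math.exp(-k * b_db ** 2)
            + (1.0 - w) * lam * math.exp(-lam * b_db))


def notch_ccdf(b_db, params):
    """P{B >= b}: the integral of notch_pdf from b to infinity."""
    w, k, lam = params
    return w * math.exp(-k * b_db ** 2) + (1.0 - w) * math.exp(-lam * b_db)


def joint_ccdf(x, y):
    """P{B_H >= x, B_D >= y} when the two notch depths are independent."""
    return notch_ccdf(x, HORN) * notch_ccdf(y, DISH)
```

Multiplying `joint_ccdf` by the number of seconds represented gives a factorable surface of the same kind as the model distribution discussed in Section 4.2.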


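The joint probability density function of A_H and A_D is conditioned on 
the values of B_H and B_D and is given by 


p_{A/B}(A_H, A_D \mid B_H, B_D) = \frac{1}{2\pi \sigma_H \sigma_D \sqrt{1-\rho^2}} \exp\left\{ \frac{-1}{2(1-\rho^2)} \left[ \frac{(A_H - g_H)^2}{\sigma_H^2} - \frac{2\rho (A_H - g_H)(A_D - g_D)}{\sigma_H \sigma_D} + \frac{(A_D - g_D)^2}{\sigma_D^2} \right] \right\}, \qquad (14)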
where 
g_H = g_H(B_H) = 23.956\,(701.11 + B_H^4)/(1320.6 + B_H^4) \qquad (15)
g_D = g_D(B_D) = 27.139\,(1223.8 + B_D^4)/(2650.9 + B_D^4) \qquad (16)
and 
\sigma_H = 6.8268 
\sigma_D = 7.0272 
\rho = 0.64995. \qquad (17)
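
The conditional density (14) is an ordinary bivariate Gaussian, so it can be evaluated, and correlated fade levels can be drawn from it, with a few lines of code. In this sketch the conditional means g_H and g_D are passed in as numbers rather than computed from the notch depths; the function names and the example means are illustrative assumptions.

```python
import math
import random

SIGMA_H, SIGMA_D, RHO = 6.8268, 7.0272, 0.64995  # constants of (17)


def fade_level_pdf(a_h, a_d, g_h, g_d):
    """Bivariate Gaussian density of (A_H, A_D) for given conditional means."""
    zh = (a_h - g_h) / SIGMA_H
    zd = (a_d - g_d) / SIGMA_D
    q = (zh * zh - 2.0 * RHO * zh * zd + zd * zd) / (1.0 - RHO ** 2)
    norm = 2.0 * math.pi * SIGMA_H * SIGMA_D * math.sqrt(1.0 - RHO ** 2)
    return math.exp(-0.5 * q) / norm


def sample_fade_levels(g_h, g_d, rng=random):
    """Draw one correlated (A_H, A_D) pair from the same distribution."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    a_h = g_h + SIGMA_H * z1
    a_d = g_d + SIGMA_D * (RHO * z1 + math.sqrt(1.0 - RHO ** 2) * z2)
    return a_h, a_d
```

Setting `RHO` to zero in this sketch decouples the two draws, which corresponds to the remark in Section 2.3 that the model then factors into two independently fading models.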


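The probability density functions for the horn and dish notch frequen- 
cies are given, respectively, by 

p_H(\phi_H) = \begin{cases} \text{[level illegible in source]}, & |\phi_H| < 90 \\ \text{[level illegible in source]}, & 90 \le |\phi_H| \le 180 \end{cases} \qquad (18)
and 
p_D(\phi_D) = \begin{cases} \text{[level illegible in source]}, & |\phi_D| < 90 \\ \text{[level illegible in source]}, & 90 \le |\phi_D| \le 180. \end{cases} \qquad (19)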


2.3 Interpretive discussion 


The salient features of the model, which are verified in Section IV, 
are easily stated. The selective components of multipath fading as seen 
by the horn and the dish are modeled as independent processes. This 
means that detailed knowledge of the transmission “shape” present at 
a given instant on one antenna provides no information concerning 
the shape that will be present on the other. The only coupling between 
the fading on the two antennas is provided through the joint condi- 
tional A-distribution of (14). The form of this conditional probability 
density function implies that the fade-level parameters, or fading 
levels, on the two antennas are related. Deeper fades on one tend to be 
accompanied by deeper fades on the other. The conditioning on the 
relative notch depth parameters implies that the fade-level parameters 
depend on the fade shapes. This is similar to the coupling provided in 
the nondiversity model®” in that the fade-level parameter is correlated 
with relative notch depth, or that deeper fades are more likely to occur 
when shape is present. For the diversity model we find that the 
existence of a shapely fade on one antenna is more likely to be 
accompanied by deeper than average fading on the other. 

There are two important limitations of the proposed model. The 
most obvious limitation is that we have not explicitly represented the 
variability in fading statistics that would accompany changes in rela- 
tive spacing in the two receiving antennas. At first blush, we ascribe 
this to the existence of only one data base for the particular configu- 
ration tested; hence, the model is only valid for antenna spacings of 
thirty feet. Upon closer inspection, one notes that the only coupling 
parameter that is unique to the diversity configuration is the parameter 
ρ in the joint conditional A-parameter distribution. (With ρ = 0, the 
model would factor into two independently fading probability models.) 
One could, in principle, relate ρ to antenna separation by calculating 
the single-frequency fade statistics of a simulated diversity switch as 
a function of ρ and comparing these results with the known results for 
various antenna separations. While an extrapolation of the model to 
larger separations is straightforward, one would expect that for 
sufficiently small antenna separations the fade shapes observed on the two 
antennas would be correlated, leading to a more complicated model. 

Fig. 2—Experimental configuration for diversity propagation measurements, August 
8 to September 6, 1977. 

The other model limitation results from the lack of phase informa- 
tion in the channel probing measurements. We model the channel as 
a minimum phase channel at all times; that is, we choose a solution 
with 0 ≤ b ≤ 1 and assume a minus sign for the fτ term in the 
exponential in (2) and (3). This limitation makes it difficult to assess 
the characteristics of the combined signal for a continuously adaptive 
space-diversity combining algorithm. 


III. PROPAGATION MEASUREMENTS AND THEIR REPRESENTATION 


3.1 Description of the propagation experiment 


The propagation measurements used in this study were obtained 
from an experiment conducted on a 26.4-mile path from Atlanta to 
Palmetto, Georgia, during the period from August 8 to September 6, 
1977. Many of the parameters of the experiment are summarized in 
Fig. 2. The radiated signal source was a general trade 78-Mb/s, 8-PSK 
digital radio operating at a nominal center frequency of 6034.2 MHz. 
The signal was received at Palmetto on both a standard horn reflector 
and a 10-foot diameter parabolic dish located 30 feet below the horn. 
The spectral energy received by each antenna was measured at 12 
frequencies separated by 2.2 MHz and spanning 24.2 MHz. The receiv- 
ing filter at each of these frequencies had a 200-kHz bandwidth. 

During fading activity the received power at each frequency on both 
antennas was measured either five times a second or once every two 
seconds, depending on how rapidly the channel was changing. Sampled 
power, quantized in 1-dB steps, was recorded by the Multiple Input 
Data Acquisition System (MIDAS), constructed by G. A. Zimmerman. 
During nonfading periods the power was recorded at a rate of once 
every thirty seconds. Based on a two-hour measurement period span- 
ning noon on each day, free-space, or nonfaded, received power levels 
were determined for each frequency. 


3.2 Diversity data base 


Over the duration of this experiment, multipath fading occurred in 
fourteen separate time periods. The measurements made in each of 
these time periods were calibrated and collected into a computer data 
base for further analysis. The overall data base includes 85,410 scans 
of both the horn and the dish, and encompasses 44,386 seconds of 
fading activity. 

Although propagation was monitored for approximately one month 
in the heavy fading season for this path, the observed fading activity 
was about twice the amount that would be expected in a typical heavy 
fading month. Figure 3 shows the time-faded statistics for the horn; it 
shows the number of seconds that power was faded to or below the 
level specified by the abscissa. (For purposes of analysis, it is assumed 
that measured or calculated parameters hold a constant value from 
one observation time instant until the next observation time instant at 
which the value may make a stepwise change.) Three of the four curves 
shown represent the power measured at single frequencies: one near the 
upper end of the frequency band, one near mid-band, and one near the lower 
edge of the band. The fourth curve represents the fading of the average 
power, based upon a wideband measurement of the received signal. 
We observe that the four curves are virtually coincident down to the 
40-dB level, where the rms power characteristically rolls off more 
rapidly. The coincidence of the curves indicates a good mix of fading 
events with no dominant events causing an excess of fading activity at 
any particular frequency in the band. 

Also shown in Fig. 3 is the fading activity predicted for this hop in 
a heavy fading month.’® The observed fading statistics match the 
L² slope of the predicted curve; however, for this period we 
have obtained twice as much time at a given level as one would expect 
in a heavy fading month. 

Figure 4 shows the time-faded statistics as observed on the dish 
antenna. The general observations made for the horn apply here also, 
except that we note that the lower frequencies in the band show more 
fading activity than the upper frequencies at fade levels near 40 dB. 
The effects of this will become more apparent in Section 4.4. Compar- 
ing the dish fading statistics with the predicted fading curve, we find 
that the observed fading activity is what one would expect in 2.5 heavy 
fading months. 

Fig. 3—Time-faded statistics of received power on the horn reflector. 

Fig. 4—Time-faded statistics of received power on the dish antenna. 

As a further consistency check, we consider the in-band power 
difference (IBPD) statistics, which have also been used for sizing data 
bases of fading observations. When fading is monitored at a number 
of frequencies in a band, one can characterize the transmission shape 
of an observed channel by IBPD, which is the difference, in decibels, 
between the largest and smallest attenuation of the observed frequencies 
at a given time. Figure 5 shows the IBPD statistics for the horn and 
the dish; that is, it shows the number of seconds that the IBPD equaled 
or exceeded the value specified by the abscissa. As a reference curve 
we also show, in Fig. 5, the IBPD curve (see Ref. 8, Fig. 20) derived 
from the data base used for the nondiversity model as scaled to a 
heavy fading month. The reference IBPD curve was derived from 23 
observed frequencies spanning a 25.3-MHz band. Although the IBPD 
ascribed to a given channel condition depends upon the frequency 
band spanned by the observations and, to a lesser extent, on the 
number of frequencies observed, the bandwidth difference is small, 
and the effects of frequency spacing may be minimized by concentrat- 
ing on the more modest values of IBPD, 5 to 10 dB. Over this range of 
IBPD there are 1.8 to 2.3 times as many seconds at a given level for the 
horn, and 2.3 to 2.7 times as many seconds for the dish. The midpoints 
of these ranges are very nearly equal to the scaling factors determined 
previously from the time-faded statistics. 
Fig. 5—Time-faded statistics of in-band power difference (IBPD) of the horn and dish. 


Splitting the difference between the 2.0 and 2.5 months-of-fading 
estimates for the horn and dish, respectively, the data base is taken as 
representing 2.25 months of multipath fading. On the basis that three 
heavy fading months are equivalent to the fading activity in a year, we 
take the data base as representing 0.75 of the expected annual multi- 
path-fading activity for this path. 
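
The sizing arithmetic above can be restated in a few lines. The variable and function names are hypothetical; the numbers are those quoted in the text.

```python
DATA_BASE_SECONDS = 44386        # seconds of fading activity in the data base
HORN_MONTHS = 2.0                # heavy fading months represented, horn estimate
DISH_MONTHS = 2.5                # heavy fading months represented, dish estimate
HEAVY_MONTHS_PER_YEAR = 3.0      # heavy fading months equivalent to one year

equivalent_months = (HORN_MONTHS + DISH_MONTHS) / 2.0
fraction_of_year = equivalent_months / HEAVY_MONTHS_PER_YEAR


def seconds_per_year(seconds_in_data_base):
    """Scale a duration observed in the data base to an annual figure."""
    return seconds_in_data_base / fraction_of_year
```

Under this scaling, any duration measured in the data base is divided by 0.75 to obtain the corresponding number of seconds per year.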


3.3 Representation of spectral measurements 


Each scan of the received power levels of the horn and dish is 
represented by the channel transfer functions (2) and (3), respectively. 
To obtain the parameters of the functions for each scan, we fit the 
squared magnitudes of the functions to the received powers. The 
fitting procedure minimizes the weighted mean squared error between 
the observed power and the estimated power. The weighting function 
is proportional to the reciprocal of the square of the received power at 
each frequency. This provides a weighting that is approximately log- 
arithmic, to match the instrumentation errors, which are independent 
and approximately Gaussian on a logarithmic scale. The reader is 
referred to Ref. 7 for additional details of the fitting procedure. 

Fig. 6—Representation of fades observed on the horn and dish on August 21, 1977, at 
1:28:3.2. 
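
To make the weighting concrete, the sketch below fits one scan by a coarse grid search over b and the notch frequency, solving for the best a² in closed form at each grid point. This is not the fitting procedure of Ref. 7, only a simple stand-in that uses the same weighted error; the grid resolution and all function names are assumptions.

```python
import cmath
import math

TAU = 6.3e-9  # fixed delay, seconds
# Twelve frequency offsets from band center, 2.2 MHz apart, in Hz.
FREQS_HZ = [(-12.1 + 2.2 * k) * 1e6 for k in range(12)]


def shape(b, phi_deg):
    """|1 - b exp(-j 2 pi (f - f0) tau)|^2 at each frequency, phi = 360 f0 tau."""
    out = []
    for f in FREQS_HZ:
        theta = 2.0 * math.pi * f * TAU - math.radians(phi_deg)
        out.append(abs(1.0 - b * cmath.exp(-1j * theta)) ** 2)
    return out


def weighted_error(powers, b, phi_deg):
    """Best a^2 and the mean squared error weighted by 1/P^2."""
    s = shape(b, phi_deg)
    num = sum(si / p for si, p in zip(s, powers))
    den = sum((si / p) ** 2 for si, p in zip(s, powers))
    a2 = num / den  # closed-form minimizer of the weighted error over a^2
    err = sum(((p - a2 * si) / p) ** 2 for si, p in zip(s, powers)) / len(powers)
    return a2, err


def fit_scan(powers):
    """Grid search over b in [0, 0.99] and phi in [-180, 179] degrees."""
    best = None
    for ib in range(100):
        for phi in range(-180, 180):
            a2, err = weighted_error(powers, ib / 100.0, phi)
            if best is None or err < best[0]:
                best = (err, math.sqrt(a2), ib / 100.0, phi)
    return best[1:]  # (a, b, phi_deg)
```

Applied to noise-free synthetic powers generated from the modeling function itself, for example `[0.01 * s for s in shape(0.7, -10)]`, the search should return parameters close to a = 0.1, b = 0.7, and a notch at −10 degrees. Because each residual is divided by the observed power, a 1-dB error counts about equally at any fade depth, which is the sense in which the weighting is approximately logarithmic.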
Figure 6 shows the fits to horn and dish scans observed concurrently. 
At this particular instant there was a notch present on the horn at a 
frequency of 4.4 MHz below the band center, and a 6-dB slope present 
on the dish. The parameter values producing the fitting functions are 
shown on the plots. The rms error between the observed levels and 
the values of the fitted function at these frequencies may be taken as 
a measure of the quality of the representation of the fade. For the horn 
scan shown, the rms error is 0.68 dB, for the dish 0.50 dB. These values 
are typical for the measurement system. The power measurement at 
each frequency has associated with it a fluctuation, or noise, that is 
additive and approximately Gaussian on a decibel scale. This noise is 
independent both frequency-to-frequency and scan-to-scan, and is 
large enough (0.6 to 0.7 dB) to mask quantization errors. 

Fig. 7—Distribution of rms fit errors. 

If the differences between measured and estimated powers were due 
solely to Gaussian fluctuations with a standard deviation, σ, the 
quantity 12E_rms²/σ² would be a χ² random variable with nine degrees of 
freedom, where E_rms is the rms error in fitting a scan. Thus, one can 
determine the quality of the modeling by comparing the distribution 
of the values of E_rms for all the horn or dish scans with that of a χ² 
variable. The distributions of the horn and dish rms errors are plotted 
in Fig. 7 along with the χ² distributions for several σ values. The excess 
error, the difference between a sample distribution and a χ² distribution 
that matches it at the median, is more than 0.1 dB for less than 1 
percent of all scans. Because the time between scans is not identical, 
there is no precise method of scaling the percentages in Fig. 7 to 
seconds per year. However, an approximate scaling is achieved by 
applying the percentage to the time covered by the data base and 
interpreting the resultant time as representing 0.75 of a year. On this 
basis the excess error in fitting the dish data exceeds 0.5 dB for about 
40 seconds a year. For the horn data the excess error is less than 0.4 
dB over the entire data base. We conclude that the fitting is excep- 
tionally good for the horn and better than average’ for the dish. 
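
As a rough cross-check of the χ² argument, the mean of a χ² variable equals its number of degrees of freedom, so the noise level σ implies a typical rms fit error. The sketch below assumes that the nine degrees of freedom arise from twelve frequencies less three fitted parameters (a, b, and f_0); that accounting and the function name are assumptions of this example.

```python
import math

N_FREQS = 12                 # frequencies measured per scan
N_PARAMS = 3                 # fitted parameters per scan (assumed: a, b, f0)
DOF = N_FREQS - N_PARAMS     # degrees of freedom of the chi-square variable


def expected_rms_error(sigma_db):
    """Root of the mean squared fit error implied by E[chi-square] = DOF."""
    return sigma_db * math.sqrt(DOF / N_FREQS)
```

For noise of 0.6 to 0.7 dB this gives values between about 0.5 and 0.6 dB, comparable to the rms errors quoted for the scans of Fig. 6.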


IV. VERIFICATION OF MODEL STATISTICS 
4.1 Overview 


We shall begin our discussion with a description of the general 
modeling problem. By drawing on the properties of probability density 
functions and on past experience in modeling selective fading, we will 
simplify the problem somewhat. We conclude this subsection with a 
statement of the objectives of the remainder of the section. 

By representing each scan of the horn and the dish with (2) and (3), 
we obtain a reduced data base consisting of 85,410 sextuples of values 
of (A_H, A_D, B_H, B_D, φ_H, φ_D). Each of these sextuples has associated 
with it a time weighting corresponding to the time interval until the 
next scan in the same fading event. We wish to describe these data by 
a function, T_D(A_H, A_D, B_H, B_D, φ_H, φ_D), whose values are equal to the 
number of seconds the six parameters were in a differential element of 
the parameter space, centered on the point (A_H, A_D, B_H, B_D, φ_H, φ_D). 
Normalizing T_D to the data base time span, we obtain a probability 
density function, 


P(A_H, A_D, B_H, B_D, \phi_H, \phi_D) = T_D(A_H, A_D, B_H, B_D, \phi_H, \phi_D)/44386. \qquad (20)


It is this probability density function that we wish to determine. We 
will ultimately show that it may be approximated by the product of 
the probability functions in (10). 

To simplify (20), let us first rewrite it as 


P(A_H, A_D, B_H, B_D, \phi_H, \phi_D) = p_{AB/\phi}(A_H, A_D, B_H, B_D \mid \phi_H, \phi_D)\, p_\phi(\phi_H, \phi_D). \qquad (21)


In previous work where multipath fading on a single antenna was 
statistically modeled,®” it was found that the notch frequency statistics 
were not related to the relative notch depth statistics or to the fade- 
level statistics. An examination of the current data base revealed the 
same properties; that is, the statistics of φ_H do not depend on those of 
A_H or B_H, and those of φ_D do not depend on those of A_D or B_D. Under the 
assumption that a cross-coupling (between φ_H and A_D, for instance) is 
even less likely, it was assumed at the outset that (21) can be written 
as 


P(A_H, A_D, B_H, B_D, \phi_H, \phi_D) = p_{AB}(A_H, A_D, B_H, B_D)\, p_\phi(\phi_H, \phi_D). \qquad (22)

We rewrite (22) as 

P(A_H, A_D, B_H, B_D, \phi_H, \phi_D) = p_{A/B}(A_H, A_D \mid B_H, B_D)\, p_B(B_H, B_D)\, p_\phi(\phi_H, \phi_D). \qquad (23)


In the remainder of this section we shall derive the functional form 
of each of the probability density functions in (23). The ultimate 
objective is to show that (23) can be represented by the factors 
multiplying T_0 on the right-hand side of (10), with the various prob- 
ability density functions as defined in (12) to (19). To this end, we 
consider the joint distribution of B_H and B_D in Section 4.2; we show 
that it can be represented as the distribution of two independent 
variables with distributions given by (12) and (13). The form of the 
conditional distribution p_{A/B}(A_H, A_D | B_H, B_D), as given in (14) to (17), 
is derived in Section 4.3. In Section 4.4, we consider the notch 
frequency distribution and show that it can be modeled by independent 
random variables as given by (18) and (19). 

As part of the process of developing a multidimensional statistical 
model one must make many choices that may seem arbitrary. How- 
ever, we have proceeded with the philosophy that we should represent 
the data well whenever there is a significant degree of fading present 
in either antenna. To accurately represent the most severe events, we 
must develop our cumulative distribution functions from the more 
severe to the less severe fades, i.e., the complement of the usual 
cumulative distribution function. Our goal is to find the simplest 
probability functions that match these distribution functions, where 
we define simplest functions as those having the fewest possible 
number of parameters. In assessing how well these objectives are 
achieved, we view the composite data base as a member of an ensemble 
of all possible data bases. Thus, the parameters obtained from the data 
base must be considered as random variables of this ensemble of fading 
events. 


4.2 Notch depth statistics 


The objective of the diversity model is to accurately represent the 
transmission shapes present on both the horn and dish at any time 
that deep or shapely fading was present on either. In previous work 
with the nondiversity model it was only necessary to represent the 
notch depth parameter at values large enough to produce several 
decibels of shape in the band. For the diversity model considered here, 
we must represent the joint distribution of horn and dish notch depth 
parameters at all values. (While one could give less importance to the 
distribution in regions where both notch depth parameters are small, 
this is found to be unnecessary.) We first develop the complement of 
the cumulative distribution of the sample values of the horn and dish 
relative notch depths, B_H and B_D, respectively. We define this two- 
dimensional function, F(x, y), as the number of seconds in the data 
base that B_H equaled or exceeded a value x, and B_D equaled or 
exceeded y: 


F(x, y) = \text{number of seconds for which } B_H \ge x \text{ and } B_D \ge y. \qquad (24)


Fig. 8—Contour plot of F̂(B_H, B_D), smoothed fit to cumulative joint distribution 
function of horn and dish relative notch depths. 
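
The sample function of (24) is a time-weighted count, which can be sketched as follows. The scan tuples shown are hypothetical values for illustration only, not data from the experiment.

```python
def seconds_at_or_above(scans, x, y):
    """F(x, y): seconds for which B_H >= x and B_D >= y.

    `scans` is a list of (b_h, b_d, dt) tuples, where dt is the time in
    seconds until the next scan in the same fading event.
    """
    return sum(dt for b_h, b_d, dt in scans if b_h >= x and b_d >= y)


# Hypothetical scans: (horn notch depth, dish notch depth, seconds held).
scans = [(12.0, 3.0, 0.2), (8.0, 11.0, 0.2), (15.5, 10.2, 2.0), (1.0, 0.5, 2.0)]
```

Each scan contributes the time until the next scan, so the function steps down as x or y crosses an observed notch depth, which is why the sample function is multiply discontinuous.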


To provide a focus for the ensuing discussions, we develop the 
function F̂(x, y), which is a smooth function fitted to the multiply- 
discontinuous function F(x, y). Figure 8 shows a contour plot of the 
surface F̂(x, y). It shows, for instance, that there are fewer than 20 
seconds in the data base during which both the horn and dish notch 
depths simultaneously exceed 10 dB. There are fewer than 5 seconds 
for which the notch depth on one antenna equals or exceeds 15 dB, 
while that on the other is 10 dB or greater. 

Figure 8 was derived by first determining the values of the function 
F(x, y) on a square grid of points, x_i, y_j, where 


x_i = \begin{cases} 0, & i = 1 \\ i - 0.5, & i > 1 \end{cases} \qquad y_j = \begin{cases} 0, & j = 1 \\ j - 0.5, & j > 1. \end{cases} \qquad (25)


Since F(x, y) falls off approximately exponentially with increasing 
values of x and y, we approximate it with F̂(x, y), where 


\hat{F}(x, y) = \exp\left( -\sum_{m+n \le N} a_{mn}\, x^m y^n \right). \qquad (26)

The coefficients a_mn in (26) were determined by minimizing the mean 
square error between ln F(x, y) and ln F̂(x, y) over all x_i, y_j less than 
24 dB for which there were five or more seconds in the data base, 
F(x_i, y_j) ≥ 5. [The 5-second limit was chosen to avoid the region of 
the x, y plane where data are becoming sparse, causing the sample 
function, F(x, y), to have increasingly extensive flat areas.] Figure 
8 shows the equal value contours of F̂(x, y) as defined by (26) with 
N = 6, for which 28 parameters, a_mn, were determined. 
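
The parameter count follows from enumerating the exponent pairs in (26), as the sketch below shows. It assumes that the sum runs over all nonnegative m and n with m + n no greater than N, which reproduces the counts quoted in the text; the function names are hypothetical.

```python
import math


def exponent_terms(order):
    """All (m, n) with m, n >= 0 and m + n <= order."""
    return [(m, n) for m in range(order + 1) for n in range(order + 1 - m)]


def smoothed_f(x, y, coeffs):
    """exp(-sum of a_mn x^m y^n); `coeffs` maps (m, n) to a_mn."""
    return math.exp(-sum(a * x ** m * y ** n for (m, n), a in coeffs.items()))
```

Because the logarithm of this function is linear in the coefficients, matching it to the logarithm of the sample function is an ordinary linear least-squares problem with one unknown per exponent pair.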

The function F̂(x, y) with N = 6 provides an excellent representation 
of the sample function, F(x, y); the rms error between ln F(x, y) and 
ln F̂(x, y) is approximately 0.092, which corresponds to an rms error 
of 9.2 percent over the fitting region. While one can reduce the fitting 
error by increasing the dimension, N, of F̂(x, y), the reduction is not 
great. The minimum error, 6.7 percent, is obtained with N = 9 (55 
coefficients, a_mn). Furthermore, the fitted functions, the F̂(x, y)'s, lose 
the appearance of distribution functions for N greater than 6. Note 
that the distribution function, F(x, y), must satisfy the inequality 


F(x, y) \ge F(x', y') \quad \text{for } x' \ge x,\ y' \ge y. \qquad (27)


It may be seen that F̂(x, y) for N = 6, as shown in Fig. 8, violates this 
inequality for B_D near 1 dB and B_H greater than 12 dB. 
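
A fitted surface tabulated on the grid can be checked against (27) by comparing neighboring grid points, since the inequality for adjacent points implies it for all pairs. The sketch below is one hypothetical way to locate violations.

```python
def violates_monotonicity(grid):
    """Return (i, j) pairs where grid[i][j] = F(x_i, y_j) is exceeded by its
    neighbor at larger x or larger y, contrary to inequality (27)."""
    bad = []
    for i, row in enumerate(grid):
        for j, value in enumerate(row):
            if i + 1 < len(grid) and grid[i + 1][j] > value:
                bad.append((i, j))
            elif j + 1 < len(row) and row[j + 1] > value:
                bad.append((i, j))
    return bad
```

An empty result means the tabulated values are nonincreasing in both x and y, as a complementary cumulative distribution must be.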

An extensive study was undertaken to find a distribution function 
that would fit F(x, y). While a function with a polynomial exponent 
such as F̂(x, y) can be fitted by solving a system of (N + 1)(N + 2)/2 
linear equations, the more general distribution functions were fitted 
using a modified gradient search routine.”’ For practical reasons the 
class of functions was limited to those having no more than eight 
parameters. Of those tried, the best was of the form 


F_M(x, y) = z_0 \left(e^{-z_1 x^2} + z_2\, e^{-z_3 x}\right)\left(e^{-z_4 y^2} + z_5\, e^{-z_6 y}\right). \qquad (28)

Fig. 9—Contour plot of F_M(B_H, B_D), model cumulative joint distribution function of 
horn and dish relative notch depths. 


For the parameter values corresponding to those in (12) and (13), the 
rms fitting error was 18.7 percent. The contour plot of this function, 
shown in Fig. 9, is seen to match the smoothed function of Fig. 8 quite 
closely. To show how well F_M(x, y) matches the original sample 
distribution, we plot the distribution of horn notch depth conditioned 
on dish notch depth in Fig. 10, along with the values of the sample 
distribution, F(x_i, y_j), being matched. A similar plot of the distribution 
of dish notch depth conditioned on horn notch depth is shown in Fig. 
11. 

Comparing Figs. 8 and 9, we see that the model function, F_M(x, y), 
has most of the properties of the function F̂(x, y). From Figs. 10 and 
11, one obtains an appreciation of the irregularities in the trends in the 
distribution of the data points; these irregularities contribute substan- 
tially to the fitting error. While one could argue that the modeling is 
acceptable on the basis of Figs. 8 to 11, there are more compelling 
reasons for accepting this distribution function as representing the 
data distribution, as we shall outline in the following paragraphs. 

If the function used to represent the data distribution, F(x, y), can 
be factored as a product of a function of x and a function of y, the 
resulting probability model will be one in which B_H and B_D are 
statistically independent. We will first show that the modeling 
function, F_M(x, y), provides close to the minimum achievable error under 
the factorability assumption. Subsequently, we shall show that there 
is no reasonable alternative. 

If the sample distribution were factorable, we could represent it 
exactly by 


F_F(x, y) = c_0 \left[\exp\left(-\sum_{n=1}^{N} a_n x^n\right)\right] \left[\exp\left(-\sum_{n=1}^{N} b_n y^n\right)\right] \qquad (29)


since we are only representing the sample distribution at a finite 
number of points. Fitting a function of the form (29) to F(x, y), as 
described previously, we find that the minimum rms error is 17.4 
percent for N = 7, which corresponds to 15 terms. For larger 
dimensionality the error increases, presumably because of loss of precision 
in the double precision solution to the set of linear equations for the 
coefficients. The model solution achieves an rms error of 18.7 percent 
with fewer than half the number of coefficients. 

As a check of the ruggedness of the model solution, we varied the 
region over which the model function was fitted. If the upper limit of 
x and y is reduced from 24 to 22 dB, the parameters change by less 
than one percent. If the region where both x and y are less than 5 dB 
is removed, the change is even smaller. We conclude that F_M(x, y) 
provides an accurate and stable approximation to F(x, y). 

There is no way of testing whether B_H and B_D are statistically 
independent. One check, which we can apply, is to estimate their 
correlation coefficient. By examining the data base, we find that the 
coefficient of correlation between B_H and B_D is 0.0306. While there are 
standard tests for the significance of correlation coefficients,” they 
pertain to the case of independent sets of samples, whereas the time 
series samples of the fading parameters we are working with are 
correlated.”* In the appendix, we show that the effective number of 
independent time samples of B_H and B_D in the data base is approximately 
1516, and that one would expect a correlation of this magnitude 
(0.0306) or greater about 27 percent of the time for this sample size. 

Various subpopulations of the B-parameters were also examined for 
correlations. For instance, consider the set of B_H and B_D observations 
for which both were equal to or greater than 6 dB. The correlation 
coefficient is —0.118 for this subpopulation. It is shown in the appendix 
that a correlation coefficient of this magnitude or greater would be 
expected to occur 53 percent of the time. 

Thus, we have based our choice of the model function F_M(x, y) on 
the following grounds: (i) the model function (Fig. 9) captures the 
essential morphology of the sample distribution (Fig. 8); and (ii) no 
candidate distribution function providing correlation between B_H and 
B_D and employing a similar number of parameters represents the 
distribution function as well. While the rms error between the model 
distribution and the sample distribution is considerably larger than 
that between the best functional representation, F̂(x, y), and the 
sample distribution, this is to be expected because F̂(x, y) has many 
more degrees of freedom and is not constrained to be a distribution 
function. In other words, F̂(x, y) represents the data within the region 
of interest by following all minor irregularities; immediately outside 
this region, this function exhibits large amplitude oscillations. By 
examining the correlation of B_H and B_D for various subpopulations we 
have established that there are correlations within these subpopula- 
tions. These correlations correspond to variations in the sample distri- 
bution surface that a factorable function, such as the model function, 
is incapable of matching, but which have been shown to be without 
significance. We conclude that there is no basis for choosing a more 
complicated function than (28) for representing the sample joint dis- 
tribution. 


4.3 Fade-level statistics 


For the nondiversity fading model, the fade level or A-distribution 
was Gaussian with a mean dependent on the relative notch depth. A 
generalization of this, a two-dimensional Gaussian probability density 
function, describing the joint probability of A_H and A_D conditioned on 
B_H and B_D would be given by (14), with g_H, g_D, σ_H, σ_D, and ρ being 
functions of B_H and B_D. To obtain, for the diversity model, a proba- 
bility density function that will easily reduce to that of the nondiversity 
model, we assume that g_H depends only on B_H, g_D depends only on 
B_D, and that σ_H, σ_D, and ρ are independent of both B_H and B_D. 

Fig. 12—Mean and standard deviation of $A_H$, fade-level parameter for 
the horn, as modeled and estimated for horn relative notch depth in 
1-decibel intervals. 

As the first step in verifying this hypothesis, we must determine the 
functional dependence of the means, $g_H$ and $g_D$, on their respective 
variables, $B_H$ and $B_D$. We do this by estimating the value of, say, 
$g_H$ at a set of values of $B_H$, and fitting a function to these sample 
means. The value of $g_H(x)$ is the average value of $A_H$ for $B_H$ equal 
to $x$. We estimate $\hat{g}_H(x)$ by taking the expectation, or average 
value, of $A_H$ in the data base for all times that $B_H$ is between 
$x - \delta$ and $x + \delta$. Specifically, we work with 1-dB intervals 
and estimate $g_H$ by 


\[ \hat{g}_H(x_i) = E\{A_H : x_i \le B_H \le x_{i+1}\}, \tag{30} \]
where $E\{\cdot\}$ denotes expectation, and the $x_i$'s are defined by (25). 
The values of $\hat{g}_H(x)$ are indicated by squares in Fig. 12, which 
also shows the function $g_H(B_H)$ of (15), which we use as the 
conditional mean of $A_H$ in the model. We use a meromorphic function, 
$g_H(B_H)$, to represent this conditional mean to ensure that it 
approaches a constant at large values of horn notch depth. Note that 
the accuracy of the estimates $\hat{g}_H$ decreases at large values of 
the horn notch depth because the number of samples decreases. (The 
approximating function $g_H$ was obtained from a weighted least-squares 
fit to the estimates, $\hat{g}_H$. The weighting was in proportion with 
the square root of the number of seconds of data in the notch depth 
interval.) 

Fig. 13—Mean and standard deviation of $A_D$, fade-level parameter for 
the dish, as modeled and estimated for dish relative notch depth in 
1-decibel intervals. 

Figure 12 also shows the results of estimating the standard deviation 
of $A_H$ conditioned on $B_H$ in the same set of intervals. The straight 
line represents the (unconditional) standard deviation (6.8268) of 
$A_H - g_H(B_H)$ for the whole data base. Figure 13 shows the results of 
duplicating for the dish parameters the calculations leading to Fig. 12. 

It is a simple matter to test the validity of the hypothesized model. 
If $A_H - g_H(B_H)$ and $A_D - g_D(B_D)$ are jointly Gaussian with zero 
means, correlation $\rho$, and respective standard deviations $\sigma_H$ 
and $\sigma_D$, they may be linearly transformed into a pair of 
zero-mean, unit-variance, independent, Gaussian random variables. We 
shall develop the transformation in two steps: a rotation of axes, 
followed by a scale change. Taking advantage of hindsight, we plot in 
Fig. 14 contours of the joint probability density for $A_H$ and $A_D$ of 
(14). The rotated axes $x$ and $y$ are defined by the transformation: 


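\[ \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} A_H - g_H(B_H) \\ A_D - g_D(B_D) \end{bmatrix}. \tag{31} \]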
Fig. 14—Equal value contours of the joint conditional probability density function 
for fade-level parameters, $A_H$ and $A_D$, showing variable transformations. 


The chosen angle, $\theta$ ($= -43.7$ degrees), satisfies the relation 

\[ \tan 2\theta = \frac{2\rho\sigma_H\sigma_D}{\sigma_H^2 - \sigma_D^2}, \tag{32} \]


which ensures that $x$ and $y$ are uncorrelated. Their variances are given 
by 

\[ \sigma_x^2 = \sigma_H^2\cos^2\theta + \sigma_D^2\sin^2\theta + 2\rho\sigma_H\sigma_D\sin\theta\cos\theta \tag{33} \]
and 

\[ \sigma_y^2 = \sigma_H^2\sin^2\theta + \sigma_D^2\cos^2\theta - 2\rho\sigma_H\sigma_D\sin\theta\cos\theta, \tag{34} \]

respectively. From the variances and correlation in (17) and the value 
of $\theta$ given above, we calculate the values of $\sigma_x$ and 
$\sigma_y$ as 4.097 and 8.900, respectively. Note that the major axes of 
the ellipses of concentration shown in Fig. 14 lie along the y-axis, 
and the minor axes along the x-axis. By rescaling the x- and y-axes we 
obtain the desired zero-mean, unit-variance, independent variables, 
$u$ and $v$, as 


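\[ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} 1/\sigma_x & 0 \\ 0 & 1/\sigma_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \tag{35} \]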
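
The rotation and rescaling of (31) through (35) can be checked numerically. The following Python sketch is illustrative only: the function name is ours, and the inputs (6.8268, 7.0272, and 0.650) are the standard deviations and correlation quoted elsewhere in this section, which we assume are the values referred to as (17).

```python
import math

def decorrelating_rotation(sigma_h, sigma_d, rho):
    """Return (theta in degrees, sigma_x, sigma_y) for the rotation of (31)-(34)."""
    # Eq. (32). atan keeps 2*theta between -90 and +90 degrees, which is
    # the branch consistent with the angle quoted in the text.
    two_theta = math.atan(2.0 * rho * sigma_h * sigma_d
                          / (sigma_h ** 2 - sigma_d ** 2))
    theta = 0.5 * two_theta
    c, s = math.cos(theta), math.sin(theta)
    cross = 2.0 * rho * sigma_h * sigma_d * s * c
    var_x = sigma_h ** 2 * c ** 2 + sigma_d ** 2 * s ** 2 + cross  # eq. (33)
    var_y = sigma_h ** 2 * s ** 2 + sigma_d ** 2 * c ** 2 - cross  # eq. (34)
    return math.degrees(theta), math.sqrt(var_x), math.sqrt(var_y)

def canonic_variables(a_h_resid, a_d_resid, theta_deg, sigma_x, sigma_y):
    """Map the residuals A_H - g_H(B_H), A_D - g_D(B_D) to (u, v) via (31), (35)."""
    t = math.radians(theta_deg)
    x = math.cos(t) * a_h_resid + math.sin(t) * a_d_resid
    y = -math.sin(t) * a_h_resid + math.cos(t) * a_d_resid
    return x / sigma_x, y / sigma_y

theta_deg, sigma_x, sigma_y = decorrelating_rotation(6.8268, 7.0272, 0.650)
```

Under these assumed inputs the computed angle and standard deviations agree with the quoted values to within rounding.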


2208 THE BELL SYSTEM TECHNICAL JOURNAL, NOVEMBER 1982 


Fig. 15—Distribution of canonic sum parameter, v, conditioned on deciles of canonic 
difference parameter, u. (Ordinate: probability that v equals or exceeds abscissa; 
abscissa: standard deviations; one curve per decile of u.) 


One may generate the sample distribution of $u$ and $v$ from the data 
base by calculating their values at each scan using eqs. (15) to (17) and 
(31) to (35). The conditional cumulative distribution functions of these 
two variables are shown in Figs. 15 and 16. Each plotted curve 
represents the cumulative distribution function of all values of one of 
the variables conditioned on the other variable being in a given decile 
of a Gaussian distribution; e.g., the first decile of $u$ contains all 
$u$ values that are less than $-1.28155$. Figure 15 shows the 
distributions of $v$ conditioned on $u$; Fig. 16 shows $u$ conditioned on 
$v$. Both families of distributions are closely grouped and 
approximately Gaussian within the range of $-3$ to $+3$ standard 
deviations. This is remarkably good agreement for a sample of this size. 
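
The decile boundaries used for the conditioning can be reproduced with the Python standard library. This is a sketch under our own naming (`gaussian_decile_edges` and `decile_index` are hypothetical helpers, not part of the original analysis); it only illustrates how a unit-variance Gaussian variable is assigned to a decile.

```python
from statistics import NormalDist

def gaussian_decile_edges():
    """Nine boundaries that split a zero-mean, unit-variance Gaussian into deciles."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(k / 10) for k in range(1, 10)]

def decile_index(value, edges):
    """Return the decile (1 through 10) into which a value of u or v falls."""
    return sum(value >= edge for edge in edges) + 1

edges = gaussian_decile_edges()
```

The first boundary is the value quoted in the text for the first decile of $u$.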
Fig. 16—Distribution of canonic difference parameter, u, conditioned on deciles of 
canonic sum parameter, v. 


Figures 15 and 16 provide good confirmation of the assumption that 
$A_H$ and $A_D$ are jointly Gaussian variables. The only assumptions that 
were studied further were those relating to the functional form of $g_H$ 
and $g_D$. The procedures outlined above were carried through under 
the assumption that $g_H$ and $g_D$ were both functions of both $B_H$ and 
$B_D$. The resultant $u$ and $v$ conditional distribution functions were 
not noticeably different from Figs. 15 and 16. The only notable 
difference was that the value of $\sigma_H$ was reduced from 6.8268 to 
6.6815, that of $\sigma_D$ from 7.0272 to 6.954, and the correlation 
coefficient increased from 0.650 to 0.681. This trivial difference in 
the coefficients would be achieved at considerable cost in complexity 
because the probability density function, (14), would become extremely 
difficult to use. The effect of this generalized assumption is not 
pronounced on the functions $g_H$ and $g_D$. For instance, the fractional 
variation of $g_H(B_H, B_D)$, as $B_D$ varied along any line of constant 
$B_H$, was on the order of 10 percent. Hence, the development of a more 
complicated distribution cannot be justified. 


4.4 Notch frequency statistics 


As a first look for dependency between the horn and dish notch 
frequency parameters, the correlation coefficient between these two 
variables, $\phi_H$ and $\phi_D$, was determined from the data base to be 
$-0.0281$. Using the techniques described in the appendix, one would 
expect a correlation magnitude larger than this value to occur 7.3 
percent of the time in a data base of this size if $\phi_H$ and $\phi_D$ 
were statistically independent. While this correlation is small, it is 
large enough to be considered “almost significant” and to warrant a 
more detailed study. 

The simplest way to look for the existence of any interdependency 
between the horn and dish notch frequencies is to plot distributions of 
one conditioned on the other. For instance, one chooses all those time 
intervals when the value of $\phi_D$ was between $-85$ and $-65$ degrees 
(or $f_{oD}$ was between 37.4 and 28.6 MHz below the center frequency of 
the channel). One then plots the fraction of this time interval (2615 
seconds) that the horn notch frequency exceeded a given value. The 
resulting conditional distribution is labeled by octagons on the 
composite plot of Fig. 17. The other distributions in Fig. 17 were 
obtained by conditioning the notch frequency on other intervals of dish 
notch frequency, as indicated. For reference, the span of the channel 
measurements is from $-27.5$ to $+27.5$ degrees ($\pm 12.1$ MHz). 

Since the overall spread of this family of distributions is small (less 
than 10 percent) over the range of $\phi_H$, we represent the entire 
family by a single distribution, (18), shown dashed. Thus, we describe 
the distribution as uniform at two levels, with values of $|\phi_H|$ 
less than 90 degrees being five times as likely as values greater than 
90 degrees. 

Figure 18 shows a set of distributions of the dish notch frequency 
conditioned on the horn notch frequency. This family of distributions 
is very tightly clustered, implying that the horn notch position had no 
influence on the distribution of dish notch position. The family of dish 
notch distributions does not fit the two-level uniform approximation, 
shown dashed in Fig. 18, as well as does the horn data. The relatively 
large deviation (about 15 percent near $-30$ degrees) between the data 
and the modeled distributions results from an asymmetry in the data, 
where, on a physical basis, none should be found. For atmospheric 
multipath, one expects transmission notches to be equally likely at any 
frequency in the neighborhood; hence, the notch frequency probability 
density functions should be symmetric and the cumulative distribution 
functions shown should be antisymmetric about the (0 degree, 50 
percent) point. 

Fig. 17—Distribution of horn notch frequency conditioned on dish notch frequencies 
in specified intervals. 

In the central 60-degree region of the horn notch frequency 
distribution, the distributions conditioned on positive dish notch 
frequencies cluster separately from those conditioned on negative 
frequencies. This is an integrated effect, in that it is not apparent 
in the conditional probability density functions (not shown). There may 
be some relation between this spread in the horn notch frequency 
distribution, the spreading of the time-faded statistics for the dish 
(Fig. 4), and the symmetry properties of the observed dish notch 
frequency distribution (noted in Fig. 18). However, these differences 
are small and no attempt was made to isolate the events giving rise to 
these effects or to incorporate them into the model. 

Fig. 18—Distribution of dish notch frequency conditioned on horn notch frequencies 
in specified intervals. 


V. CONCLUDING REMARKS 


We have provided in Section II a statistical model of multipath 
fading as observed in a diversity configuration over an extended period 
of time. Supporting evidence of the accuracy of the statistical model 
was presented in Section IV, along with a description of the method- 
ologies employed. The transmission path to each antenna was repre- 
sented by a function synthesizing a simplified three-path fade. While 
one could statistically represent the joint occurrence of the parameters 
of these two functions in the observed data base with greater precision 
using a more complex model, all attempts to establish the significance 
of higher order extrapolations of the proposed statistical model have 
been negative. There would seem to be little virtue in representing, 
with a particular statistical model, features that would not be found in 
a fading data base corresponding to a different observation period. 

For the proposed model, the only correlation in the fading of the 
two antennas is in the level of fade. This simplicity of dependence is a 
direct result of the form of the function used to represent fading on 
the two antennas. More, or different, interdependencies might become 
apparent or significant with other fade representations. The limited 
dependence is a virtue of the model since it places all of the impact of 
diversity antenna separation in a single parameter, for separations in 
the range of practical interest. However, for sufficiently small separa- 
tion one would ultimately expect to see correlations in the shapes of 
the fades observed on the two antennas. 

The proposed model was only checked against the data; however, it 
reduces to the form of the nondiversity model, which has been exten- 
sively verified. A model representing fading on the horn is obtained by 
integrating the model statistics over the dish parameters, and vice 
versa. Some of the parameters of the horn and dish models derived 
from the diversity model are different from those that have been 
derived previously, but the differences are not great, on the order of 10 
percent, at most. As a consequence, one would not be surprised by 20 
to 30 percent differences between expectations calculated with this 
model and corresponding expectations calculated with the nondiversity 
model. This merely reflects the month-to-month and year-to-year 
variability in the nature and severity of multipath fading. 
APPENDIX 


Correlation and Significance 


The purpose of this appendix is to develop the methods of testing 
for the significance of the correlation between two random variables 
when the sample values of each variable are taken from a time series 
with known autocorrelation. Consider two stationary, independent, 
zero mean, unit variance random processes, x(t) and y(t). Assume that 
we have samples of each at a large number, N, of time instants, where 
the ith time instant is taken as 


\[ t_i = i\,\Delta t, \qquad i = 1, 2, 3, \ldots, N. \tag{36} \]

We are interested in the correlation coefficient of $x$ and $y$ as 
determined from the set of samples $x_i = x(t_i)$ and $y_i = y(t_i)$, where 

\[ \rho = \frac{1}{N}\sum_{i=1}^{N} x_i y_i. \tag{37} \]


Because $x(t)$ and $y(t)$ are independent processes by assumption, the 
expected value of $\rho$ is zero: 

\[ \overline{\rho} = \frac{1}{N}\sum_{i=1}^{N} \overline{x_i y_i} = 0, \tag{38} \]

where the overbar implies an ensemble average. 
We wish to determine the variance of the estimate (37). We write 
the expected value of this sample variance as 

\[ \sigma_\rho^2 = \overline{\rho^2} = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} \overline{x_i y_i x_j y_j}. \tag{39} \]
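Under the hypothesis that $x(t)$ and $y(t)$ are independent processes, we 
may rewrite (39) as 

\[ \sigma_\rho^2 = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} \overline{x_i x_j}\;\overline{y_i y_j} = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} \rho_x(i - j)\,\rho_y(i - j), \tag{40} \]
where 
\[ \rho_x(i) = \overline{x_k x_{k+i}} \tag{41} \]
and 
\[ \rho_y(i) = \overline{y_k y_{k+i}} \tag{42} \]

are the autocorrelation functions of $x(t)$ and $y(t)$, respectively, at 
time difference $i\,\Delta t$. We may rewrite $\sigma_\rho^2$ as 

\[ \sigma_\rho^2 = \frac{1}{N^2}\left[ N + 2\sum_{k=1}^{N-1} (N - k)\,\rho_x(k)\,\rho_y(k) \right]. \tag{43} \]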


If the time samples of $x(t)$ and $y(t)$ were independent, the 
autocorrelation functions would be unity for $k = 0$ and zero elsewhere, 
that is, 

\[ \sigma_\rho^2 = \frac{1}{N} \quad \text{if } \rho_x(i) = \rho_y(i) = 0 \text{ for } i \ne 0. \tag{44} \]

Let us define an effective sample size, $N_{\mathrm{eff}}$, such that, for 
a given sample size, $N$, and given autocorrelation functions, 
$\rho_x(i)$ and $\rho_y(i)$, 

\[ \sigma_\rho^2 = \frac{1}{N_{\mathrm{eff}}}. \tag{45} \]
Then, 
\[ \frac{N}{N_{\mathrm{eff}}} = 1 + 2\sum_{k=1}^{N-1} \left(\frac{N - k}{N}\right) \rho_x(k)\,\rho_y(k). \tag{46} \]
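
For uniformly spaced samples, the ratio in (46) can be evaluated directly. The Python sketch below is illustrative only: the function name and the geometric autocorrelations in the usage lines are hypothetical choices of ours, not values taken from the data base.

```python
def sample_size_ratio(rho_x, rho_y, n):
    """Evaluate N / N_eff of eq. (46) for n uniformly spaced samples.

    rho_x and rho_y are callables returning the autocorrelation at
    integer lag k (in units of the sample spacing).
    """
    total = 0.0
    for k in range(1, n):
        total += (n - k) / n * rho_x(k) * rho_y(k)
    return 1.0 + 2.0 * total

# Independent samples: both autocorrelations vanish for k != 0,
# so N_eff equals N, as in (44).
white = sample_size_ratio(lambda k: 0.0, lambda k: 0.0, 1000)

# Hypothetical geometric autocorrelations, rho(k) = 0.5 ** k for both processes.
correlated = sample_size_ratio(lambda k: 0.5 ** k, lambda k: 0.5 ** k, 1000)
```

Positive autocorrelation in both series raises the ratio above unity, that is, it reduces the effective sample size.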


Our object is to use (45) and (46) for the diversity data base. As 
noted in Section III, the data base is not uniformly sampled. 
Furthermore, multipath fading is not a stationary random process. One 
may, however, define a lagged autocorrelation function for the samples 
of such a process” at delay $\tau$ as 


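\[ \rho_x(\tau) = \frac{\dfrac{1}{M}\sum x_m x_{m+\tau} - \left(\dfrac{1}{M}\sum x_m\right)\left(\dfrac{1}{M}\sum x_{m+\tau}\right)}{\left[\dfrac{1}{M}\sum x_m^2 - \left(\dfrac{1}{M}\sum x_m\right)^2\right]^{1/2}\left[\dfrac{1}{M}\sum x_{m+\tau}^2 - \left(\dfrac{1}{M}\sum x_{m+\tau}\right)^2\right]^{1/2}}, \tag{47} \]

where the sample denoted $x_{m+\tau}$ is $\tau$ seconds delayed from the 
sample $x_m$, and the sums are taken over all $M = M(\tau)$ pairs of 
samples in the data base with a delay difference of $\tau$. 

Figure 19 shows plots of (47) for: (a) the relative notch depth of the 
horn, and (b) the relative notch depth of the dish. Figure 20 shows 
$M(\tau)$, the sample size for the data base, as a function of the delay, 
$\tau$, for delays of integer numbers of seconds. Using these two 
figures as examples, we can approximate the quantity 
$N/N_{\mathrm{eff}}$. We describe the autocorrelation functions by 

\[ \rho_x(\tau) = a_x e^{\tau \ln \beta_x} = a_x \beta_x^{\tau} \tag{48} \]

and 
\[ \rho_y(\tau) = a_y e^{\tau \ln \beta_y} = a_y \beta_y^{\tau}. \tag{49} \]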


The samples in the data base are taken nonuniformly in a set of 
disjoint intervals. With 85,410 samples in 44,386 seconds we have an 
average sample spacing of 0.5 second. Hence, we approximate the 
uniform-sampling window function of (46), $(N - k)/N$, by taking an 
approximation to $M(\tau)/N$, as shown in Fig. 20 and given by 

Fig. 20—Sample size for autocorrelation estimates in Fig. 19. 

Using (48) to (50) with $\tau = 0.5k$ corresponding to the 0.5 second 
average sampling interval, we rewrite (46) as 


\[ \frac{N}{N_{\mathrm{eff}}} = 1 + 2\sum_{k=1}^{\infty} \frac{M(k/2)}{N}\, a_x a_y (\beta_x \beta_y)^{k/2} \tag{51} \]
or 








N Qaxay { 3.5re7/ llr 1 
=1+ sl a isle [pore eae (97 
Nef 17 . —-re Vv? J|-r 8000(1 — r) 2) 
where 
\[ r = (\beta_x \beta_y)^{0.5}. \tag{53} \]


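Consider the horn and dish relative notch depths, $B_H$ and $B_D$, 
respectively. Letting $B_H = x$ and $B_D = y$, 

\[ a_x = a_y = 0.75, \qquad \beta_x = e^{-1/55}, \qquad \beta_y = e^{-1/80}, \qquad r = 0.98478. \tag{54} \]

Therefore, from (51) to (54), 

\[ \frac{N}{N_{\mathrm{eff}}} = 56.3. \tag{55} \]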

For the data base, the calculated value of the correlation coefficient of 
$B_H$ and $B_D$ was 0.0306 for the entire set of 85,410 samples. From 
(55), $N_{\mathrm{eff}} = 1516$. Using this in (45), we find 
$\sigma_\rho = 0.0257$. Under the unfavorable assumption” that the 
distribution of $\rho$ is normal, we would expect to find a value of 
$|\rho|$ this large ($1.19\sigma_\rho$) or greater to occur for less than 
27 percent of the samples of this size. 

Similarly, for the case of a subpopulation of 1608 samples of $B_H$ and 
$B_D$, whose correlation coefficient was $-0.118$, we find from (55) that 
$N_{\mathrm{eff}} = 28.5$, and from (45) that $\sigma_\rho = 0.187$. We 
would expect to find a value of $|\rho|$ this large ($0.63\sigma_\rho$) 
or greater to occur for less than 53 percent of the samples of this 
size. One concludes that in neither of these two cases does the 
correlation differ significantly from zero. 
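
The arithmetic of the two significance tests above can be retraced as follows. This Python sketch assumes, as the text does, that the sample correlation is normally distributed with standard deviation given by (45); the function name is ours, and the two-sided tail probability is our reading of "a value of $|\rho|$ this large or greater."

```python
import math
from statistics import NormalDist

def correlation_significance(rho, n_samples, ratio):
    """Return (N_eff, sigma_rho, |rho| / sigma_rho, two-sided tail probability).

    ratio is N / N_eff, e.g. the value 56.3 of eq. (55).
    """
    n_eff = n_samples / ratio
    sigma_rho = 1.0 / math.sqrt(n_eff)           # eq. (45)
    z = abs(rho) / sigma_rho
    tail = 2.0 * (1.0 - NormalDist().cdf(z))     # normal approximation
    return n_eff, sigma_rho, z, tail

whole_base = correlation_significance(0.0306, 85410, 56.3)
subpopulation = correlation_significance(-0.118, 1608, 56.3)
```

Both tail probabilities are far above any conventional significance threshold, which is the basis for the conclusion that neither correlation differs significantly from zero.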
Matrix Analysis of Mildly Nonlinear, Multiple- 
Input, Multiple-Output Systems With Memory 


By A. A. M. SALEH 
(Manuscript received April 29, 1982) 


A matrix method of analysis is developed for mildly nonlinear, 
multiple-input, multiple-output systems with memory (e.g., nonlinear 
multiport networks and multichannel communication systems). The 
method is based on a Volterra-series representation whose kernels 
are two-dimensional matrices rather than multidimensional arrays. 
This is made possible through the use of the Kronecker product of 
matrices, which results in a compact formulation. The response of 
the aforementioned systems to multiple sinusoidal excitations is also 
studied. Moreover, formulas are given for various system operations 
(e.g., addition, cascading, inversion, and feedback), which can be 
used to describe a complex system as an interconnection of simple 
subsystems. 


I. INTRODUCTION 


Communication, control, and instrumentation systems employ com- 
ponents, such as amplifiers and mixers, which are inherently nonlinear. 
Even when the nonlinearities are mild, as is often the case, they can 
produce bothersome signal distortion that limits the system perform- 
ance. The nonlinear components themselves, and the other linear 
components used in the system, are generally frequency-dependent, 
i.e., they have memory. Numerous studies are available in the literature 
for the analysis of mildly nonlinear systems with memory through the 
use of Volterra-series expansions.’*' The classic paper by Bedrosian 
and Rice,’ the recent paper by Chua and Ng,“ and the book by Weiner 
and Spina” cover that subject very thoroughly. Also, the paper by 
Gopal, Nakhla, Singhal, and Vlach”’ is interesting in that it evaluates 
the range of accuracy of the Volterra-series approach by comparing it 
with a nearly exact, but quite involved, method of analysis. The book”® 
and paper’ by Schetzen deal mainly with random inputs. The condi- 
tions for the existence of a Volterra-series representation have recently 
been studied rigorously by Sandberg.” 

For the most part, the studies mentioned above are limited to 
systems with one input and one output, i.e., “scalar” systems. This 
scalar representation is usually not easily applicable to Multiple-Input, 
Multiple-Output (MIMO) systems. Such systems include, for example, 
nonlinear multiport networks, multichannel communication systems, 
and transmitting or receiving systems employing multibeam antennas. 
In principle, one can represent these systems by a set of dependent 
scalar Volterra equations. This was done, for example, in the papers 
by Narayanan® and by Bussgang, Ehrman, and Graham,’ where node 
equations were used to analyze nonlinear, two-port network models of 
bipolar transistor amplifiers. This method of analysis is tractable only 
when the numbers of nodes and of nonlinear elements in the network 
are small. For example, when the above authors considered the analysis 
of two-stage transistor amplifiers, they were forced by the complexity 
of the cascade equations involved to assume that the interaction 
between the stages, i.e., the loading effect of one stage on the other, is 
linear. While this might have been a reasonable approximation in their 
particular case, it is not valid in general. A symbolic matrix inversion 
algorithm that simplifies the computational aspects of the nodal 
method of analysis was recently discussed by Thapar and Leon.’*® 

To conveniently handle the problem of two-port networks, or to 
analyze nonlinear multiport networks in general, one needs to use a 
black-box representation of the network, as is usually done in linear 
networks. For example, consider a nonlinear, two-port network, which 
has two independent port variables (e.g., the port currents) and two 
dependent port variables (e.g., the port voltages). One should be able 
to express the latter variables in terms of the former (e.g., by a 
nonlinear impedance representation). Furthermore, one should be able 
to perform transformations among various network representations 
(e.g., from impedance to cascade parameters), and to carry out the 
computations involved in interconnecting several networks together to 
form a complex network (e.g., through cascading). The same operations 
are also needed in the analysis of other nonlinear MIMO systems. 

The purpose of this paper is to develop a method for analyzing 
mildly nonlinear MIMO systems with memory. This method, which 
employs Volterra-series whose kernels are two-dimensional matrices, 
facilitates the systematic performance of various useful system oper- 
ations, such as addition, cascading, inversion, and feedback. The 
application of the results of this study to the analysis of mildly 
nonlinear multiport networks will be the subject of a future paper. 

Actually, Weiner and Naditch,’° and Gopal, Nakhla, Singhal, and 
Vlach” used multidimensional arrays of Volterra kernels to represent 
nonlinear, two-port networks. The same was suggested by Chua and 
Ng” for extending their results to multiple-input systems. All of these 
analyses can also be generalized to multiport networks and other MIMO 
systems. The resulting notation is similar to the index notation dis- 
cussed in the beginning of the next section and in Appendix A. This 
notation, though more natural in its initial formulation, turns out to 
be cumbersome when attempting to perform the aforementioned sys- 
tem operations. 


II. REPRESENTATION OF NONLINEAR MEMORYLESS MIMO SYSTEMS 


A nonlinear, memoryless scalar system is characterized by its 
instantaneous input-output transfer function. When this function is 
analytic, as is the usual case encountered in practice, it can be 
represented by the power-series expansion 

\[ w = P^{(1)} u + P^{(2)} u^2 + P^{(3)} u^3 + \cdots, \tag{1} \]

where $u = u(t)$ is the input, $w = w(t)$ is the output, and $P^{(k)}$, 
$k = 1, 2, 3, \ldots$, are system constants. The corresponding 
representation of a nonlinear, memoryless, MIMO system with $n$ inputs, 
$u_j = u_j(t)$, $j = 1, 2, \ldots, n$, and $m$ outputs, $w_i = w_i(t)$, 
$i = 1, 2, \ldots, m$, is 

\[ w_i = \sum_{j_1=1}^{n} P^{(1)}_{i j_1} u_{j_1} + \sum_{j_1=1}^{n}\sum_{j_2=1}^{n} P^{(2)}_{i j_1 j_2} u_{j_1} u_{j_2} + \sum_{j_1=1}^{n}\sum_{j_2=1}^{n}\sum_{j_3=1}^{n} P^{(3)}_{i j_1 j_2 j_3} u_{j_1} u_{j_2} u_{j_3} + \cdots, \qquad i = 1, 2, \ldots, m, \tag{2} \]

where $P^{(k)}_{i j_1 \cdots j_k}$, $k = 1, 2, 3, \ldots$, are 
$(k + 1)$-dimensional, $m \times n \times \cdots \times n$ arrays of 
system constants. The notation used in (2) will be referred to as the 
“index notation.” It is similar to that used in Refs. 10 and 12, but 
the superscripts and subscripts are interchanged. We now proceed to 
represent (2) in the “matrix notation.” 
Let 

\[ \mathbf{u} = \mathbf{u}(t) = \begin{bmatrix} u_1(t) \\ u_2(t) \\ \vdots \\ u_n(t) \end{bmatrix}, \qquad \mathbf{w} = \mathbf{w}(t) = \begin{bmatrix} w_1(t) \\ w_2(t) \\ \vdots \\ w_m(t) \end{bmatrix} \tag{3} \]

be the $n \times 1$ and $m \times 1$ input and output vectors, 
respectively. The first (i.e., linear) term in (2) can be written as an 
ordinary product of matrices in the form 
$\mathbf{w} = \mathbf{P}^{(1)} \cdot \mathbf{u}$, where 
$\mathbf{P}^{(1)}$ is the $m \times n$ matrix $[P^{(1)}_{ij}]$. We will 
now show that the remaining terms in (2) can also be written in a 
matrix form through the use of the Kronecker product of matrices.” 
Appendix B defines this product and gives some of its useful 
properties. Actually, Harper and Rugh” employed the Kronecker product 
in conjunction with state variables to study factorable, scalar, 
nonlinear systems. Also, Brockett”*”® used a reduced form of the 
Kronecker product (to be explained shortly) in the state-variable 
representation of scalar, time-varying, nonlinear systems that are 
linear in the control variable. 

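As is explained below, the elements of the $(k + 1)$-dimensional, 
$m \times n \times \cdots \times n$ arrays, 
$\{P^{(k)}_{i j_1 \cdots j_k}\}$, can be reorganized to form 
two-dimensional, $m \times n^k$ matrices, $\{\mathbf{P}^{(k)}\}$, such 
that (2) can be written in the matrix form 

\[ \mathbf{w} = \mathbf{P}^{(1)} \cdot \mathbf{u} + \mathbf{P}^{(2)} \cdot (\mathbf{u} \times \mathbf{u}) + \mathbf{P}^{(3)} \cdot (\mathbf{u} \times \mathbf{u} \times \mathbf{u}) + \cdots, \tag{4} \]

where “$\times$” is the Kronecker-product sign. As mentioned in 
Appendix B, we will employ left Kronecker products. 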

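To understand (4), we note from (61) that the $k$-fold Kronecker 
product $\mathbf{u} \times \mathbf{u} \times \cdots \times \mathbf{u}$ 
results in an $n^k \times 1$ vector whose $j$th element is given by 

\[ [\mathbf{u} \times \mathbf{u} \times \cdots \times \mathbf{u}]_j = u_{j_1} u_{j_2} \cdots u_{j_k}, \tag{5} \]

where $j_1, j_2, \ldots, j_k$ are uniquely determined from 
\[ j = j_1 + n(j_2 - 1) + \cdots + n^{k-1}(j_k - 1). \tag{6} \]
Thus, to make (4) equivalent to (2), the $i$-$j$ element of the 
$m \times n^k$ matrix $\mathbf{P}^{(k)}$ should be given by 
\[ [\mathbf{P}^{(k)}]_{ij} = P^{(k)}_{i j_1 \cdots j_k}, \tag{7} \]

where $j$ is given by (6). 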
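For example, if $m = n = 2$, (5)-(7) give 

\[ \mathbf{u} \times \mathbf{u} = \begin{bmatrix} u_1 u_1 \\ u_2 u_1 \\ u_1 u_2 \\ u_2 u_2 \end{bmatrix}, \qquad \mathbf{u} \times \mathbf{u} \times \mathbf{u} = \begin{bmatrix} u_1 u_1 u_1 \\ u_2 u_1 u_1 \\ u_1 u_2 u_1 \\ u_2 u_2 u_1 \\ u_1 u_1 u_2 \\ u_2 u_1 u_2 \\ u_1 u_2 u_2 \\ u_2 u_2 u_2 \end{bmatrix}, \tag{8} \]

\[ \mathbf{P}^{(2)} = \begin{bmatrix} P^{(2)}_{111} & P^{(2)}_{121} & P^{(2)}_{112} & P^{(2)}_{122} \\ P^{(2)}_{211} & P^{(2)}_{221} & P^{(2)}_{212} & P^{(2)}_{222} \end{bmatrix}, \tag{9} \]

\[ \mathbf{P}^{(3)} = \begin{bmatrix} P^{(3)}_{1111} & P^{(3)}_{1211} & P^{(3)}_{1121} & P^{(3)}_{1221} & P^{(3)}_{1112} & P^{(3)}_{1212} & P^{(3)}_{1122} & P^{(3)}_{1222} \\ P^{(3)}_{2111} & P^{(3)}_{2211} & P^{(3)}_{2121} & P^{(3)}_{2221} & P^{(3)}_{2112} & P^{(3)}_{2212} & P^{(3)}_{2122} & P^{(3)}_{2222} \end{bmatrix}. \tag{10} \]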
Note that each of the Kronecker-product vectors given in (8) has 
redundant entries. In Brockett’s notation cited earlier, these entries 
would be removed. For example, the $4 \times 1$ vector, 
$\mathbf{u} \times \mathbf{u}$, would be replaced by the $3 \times 1$ 
vector $[u_1^2, u_1 u_2, u_2^2]$, and the corresponding $2 \times 4$ 
matrix, $\mathbf{P}^{(2)}$, given in (9) would be reduced to a 
$2 \times 3$ matrix, etc. However, when the system has memory, no 
redundant entries occur, since, as can be seen for example from (11), 
one needs to evaluate Kronecker products of the form 
$\mathbf{u}(t_1) \times \mathbf{u}(t_2)$, etc., where $t_1 \ne t_2$. 
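
The reorganization described by (5) through (7) can be illustrated with a short Python sketch. It is a sketch under our own conventions: the function names are hypothetical, the array of system constants is stored as a dictionary keyed by the 1-based index tuple $(i, j_1, \ldots, j_k)$, and the left Kronecker product is taken to be the ordering in which the first factor's index varies fastest, as implied by (6) and (8).

```python
from itertools import product

def left_kron(*vectors):
    """Left Kronecker product of vectors: the first factor's index varies fastest."""
    result = [1]
    for vec in vectors:
        # Each additional factor becomes the slower (outer) index.
        result = [r * v for v in vec for r in result]
    return result

def flat_index(indices, n):
    """Eq. (6): 1-based column j from the 1-based indices (j_1, ..., j_k)."""
    return 1 + sum((j - 1) * n ** pos for pos, j in enumerate(indices))

def kernel_matrix(coeff, m, n, k):
    """Eq. (7): arrange coeff[(i, j_1, ..., j_k)] into an m x n**k matrix."""
    mat = [[0.0] * n ** k for _ in range(m)]
    for i in range(1, m + 1):
        for js in product(range(1, n + 1), repeat=k):
            mat[i - 1][flat_index(js, n) - 1] = coeff[(i, *js)]
    return mat

def matvec(mat, vec):
    """Ordinary matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vec)) for row in mat]
```

With these helpers, `matvec(kernel_matrix(coeff, m, n, 2), left_kron(u, u))` evaluates the second-order term of (4) and agrees with the double sum in (2).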

In the remainder of the paper, we will employ the compact matrix 
notation used in (4) rather than the index notation used in (2). 
However, on some occasions, it is helpful to keep track of the interre- 
lation between the two notations. Thus, some key equations in the 
paper are rewritten in Appendix A in the index notation. 


III. REPRESENTATION OF NONLINEAR MIMO SYSTEMS WITH MEMORY 


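The usual Volterra-series expansion used to represent nonlinear, 
time-invariant, scalar systems with memory’ can be generalized 
through the use of the notation of (4) to represent MIMO systems by 
the matrix equation 

\[ \mathbf{w}(t) = \int_{-\infty}^{\infty} \mathbf{p}^{(1)}(\tau_1) \cdot \mathbf{u}(t - \tau_1)\, d\tau_1 \]
\[ \quad + \iint_{-\infty}^{\infty} \mathbf{p}^{(2)}(\tau_1, \tau_2) \cdot [\mathbf{u}(t - \tau_1) \times \mathbf{u}(t - \tau_2)]\, d\tau_1 d\tau_2 \]
\[ \quad + \iiint_{-\infty}^{\infty} \mathbf{p}^{(3)}(\tau_1, \tau_2, \tau_3) \cdot [\mathbf{u}(t - \tau_1) \times \mathbf{u}(t - \tau_2) \times \mathbf{u}(t - \tau_3)]\, d\tau_1 d\tau_2 d\tau_3 + \cdots, \tag{11} \]

where $\mathbf{u}(t)$ and $\mathbf{w}(t)$ are, respectively, the 
$n \times 1$ and $m \times 1$ input and output vectors given by (3), and 
where $\mathbf{p}^{(k)}(\tau_1, \ldots, \tau_k)$, $k = 1, 2, 3, \ldots$, 
are two-dimensional, $m \times n^k$ matrices of system kernels. Note 
that if $\mathbf{p}^{(k)}(\tau_1, \ldots, \tau_k) = \mathbf{P}^{(k)} \delta(\tau_1) \cdots \delta(\tau_k)$, 
where $\delta(\tau)$ is the unit impulse function, then the system 
becomes memoryless, and (11) reduces to (4). 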

As is the case for linear systems, it is more convenient to represent 
(11) in the frequency domain. To do this, we introduce the dummy 
time variables, $t_1, t_2, \ldots, t_k$, and rewrite the $k$th order 
output component in (11) as 

\[ \mathbf{w}^{(k)}(t_1, \ldots, t_k) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \mathbf{p}^{(k)}(\tau_1, \ldots, \tau_k) \cdot [\mathbf{u}(t_1 - \tau_1) \times \cdots \times \mathbf{u}(t_k - \tau_k)]\, d\tau_1 \cdots d\tau_k. \tag{12}\dagger \]
Thus, (11) becomes 
\[ \mathbf{w}(t) = \mathbf{w}^{(1)}(t) + \mathbf{w}^{(2)}(t, t) + \mathbf{w}^{(3)}(t, t, t) + \cdots. \tag{13} \]


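Now, we introduce the single-dimensional Fourier-transform pair 

\[ X(f) = \int_{-\infty}^{\infty} x(t) \exp(-j2\pi f t)\, dt, \tag{14a} \]

\[ x(t) = \int_{-\infty}^{\infty} X(f) \exp(j2\pi f t)\, df, \tag{14b} \]

to represent the transformations 
$\mathbf{u}(t) \Leftrightarrow \mathbf{U}(f)$ and 
$\mathbf{w}(t) \Leftrightarrow \mathbf{W}(f)$. Similarly, we introduce 
the multidimensional Fourier-transform pair 

\[ Y(f_1, \ldots, f_k) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} y(t_1, \ldots, t_k) \exp[-j2\pi(f_1 t_1 + \cdots + f_k t_k)]\, dt_1 \cdots dt_k, \tag{15a} \]

\[ y(t_1, \ldots, t_k) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} Y(f_1, \ldots, f_k) \exp[j2\pi(f_1 t_1 + \cdots + f_k t_k)]\, df_1 \cdots df_k, \tag{15b} \]

to represent the transformations 
$\mathbf{p}^{(k)}(\tau_1, \ldots, \tau_k) \Leftrightarrow \mathbf{P}^{(k)}(f_1, \ldots, f_k)$ 
and $\mathbf{w}^{(k)}(t_1, \ldots, t_k) \Leftrightarrow \mathbf{W}^{(k)}(f_1, \ldots, f_k)$. 
It can be shown from (14) and (15) that (12) can be written in the 
frequency domain as (see Refs. 1 through 4, 7 through 9, 14 and 17) 

\[ \mathbf{W}^{(k)}(f_1, \ldots, f_k) = \mathbf{P}^{(k)}(f_1, \ldots, f_k) \cdot [\mathbf{U}(f_1) \times \cdots \times \mathbf{U}(f_k)]. \tag{16} \]

The Fourier transform of the output becomes 

\[ \mathbf{W}(f) = \mathbf{W}^{(1)}(f) + \int_{-\infty}^{\infty} \mathbf{W}^{(2)}(f_1, f - f_1)\, df_1 + \iint_{-\infty}^{\infty} \mathbf{W}^{(3)}(f_1, f_2, f - f_1 - f_2)\, df_1 df_2 + \cdots. \tag{17} \]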


† All equations in the paper marked by a dagger are rewritten in the index notation 
in Appendix A, where the same equation numbers are used. 
Fig. 1—A nonlinear, time-invariant, MIMO system with memory having n inputs and 
m outputs. 





Fig. 2—Interrelations among the input, output components, and total output in the 
time and frequency domains for the nonlinear MIMO system of Fig. 1. The numbers in 
parentheses represent equation numbers in the text. 


Note from (13) and (17) that the single-dimensional Fourier transform 
of $\mathbf{w}^{(k)}(t, \ldots, t)$ is given by the $k$th term in (17), 
which is not equal to $\mathbf{W}^{(k)}(f, \ldots, f)$ unless $k = 1$. 

A schematic diagram of the system represented by (11) through (17) 
is given in Fig. 1. The interrelations among the input, the output 
components, and the total output in the time and frequency domains, 
and the corresponding equation numbers, are indicated in the flow- 
chart of Fig. 2. 


IV. KERNEL SYMMETRIZATION 


The representation of the response of a nonlinear scalar system to 
sinusoidal and Gaussian excitations is greatly simplified if each of the 
kernels, $P^{(k)}(f_1, \ldots, f_k)$, or equivalently, 
$p^{(k)}(\tau_1, \ldots, \tau_k)$, is a symmetric function of its 
arguments.’*"* The generalization of this symmetry requirement to 
nonlinear MIMO systems is somewhat more involved. Following the 
reasoning given in the aforementioned references, one can show that it 
is the output components, $\mathbf{W}^{(k)}(f_1, \ldots, f_k)$ given by 
(16), or equivalently, $\mathbf{w}^{(k)}(t_1, \ldots, t_k)$ given by 
(12), that are required to be symmetric functions of their arguments. 
For example, for $k = 2$, it is required that 
$\mathbf{W}^{(2)}(f_1, f_2) = \mathbf{W}^{(2)}(f_2, f_1)$; and thus, from 
(16), 

\[ \mathbf{P}^{(2)}(f_1, f_2) \cdot (\mathbf{u}_1 \times \mathbf{u}_2) = \mathbf{P}^{(2)}(f_2, f_1) \cdot (\mathbf{u}_2 \times \mathbf{u}_1), \tag{18a} \]

where $\mathbf{U}(f_i)$ is replaced by $\mathbf{u}_i$ for generality. 
Similarly, for $k = 3$, it is required that 
$\mathbf{W}^{(3)}(f_1, f_2, f_3) = \mathbf{W}^{(3)}(f_\alpha, f_\beta, f_\gamma)$; 
and thus, from (16), 

\[ \mathbf{P}^{(3)}(f_1, f_2, f_3) \cdot (\mathbf{u}_1 \times \mathbf{u}_2 \times \mathbf{u}_3) = \mathbf{P}^{(3)}(f_\alpha, f_\beta, f_\gamma) \cdot (\mathbf{u}_\alpha \times \mathbf{u}_\beta \times \mathbf{u}_\gamma), \tag{18b} \]

where $\alpha, \beta, \gamma$ assume all permutations of 1, 2, 3. For a 
scalar system, (18a) and (18b) are indeed equivalent to requiring the 
corresponding system kernels to be symmetric functions of their 
arguments, as mentioned above. 

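To find the symmetry requirement implied by (18) on the kernels of 
a MIMO system, we need to introduce the $n^2 \times n^2$ “reversing” 
matrix, $\mathbf{R}$, and the six $n^3 \times n^3$ “permutation” 
matrices, $\mathbf{\Phi}_{\alpha\beta\gamma}$, where 
$\alpha, \beta, \gamma$ assume all permutations of 1, 2, 3. These 
matrices have properties such that if $\mathbf{u}_1$, $\mathbf{u}_2$, 
and $\mathbf{u}_3$ are $n \times 1$ vectors, then 

\[ \mathbf{R} \cdot (\mathbf{u}_1 \times \mathbf{u}_2) = \mathbf{u}_2 \times \mathbf{u}_1, \tag{19} \]

\[ \mathbf{\Phi}_{\alpha\beta\gamma} \cdot (\mathbf{u}_1 \times \mathbf{u}_2 \times \mathbf{u}_3) = \mathbf{u}_\alpha \times \mathbf{u}_\beta \times \mathbf{u}_\gamma. \tag{20} \]

Appendix C defines these matrices and gives some of their useful 
properties. 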


Finally, (18) through (20) give the required symmetry conditions of 
the kernels as 


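$$P^{(2)}(f_1, f_2) = P^{(2)}(f_2, f_1)\cdot R, \qquad (21a)^{\dagger}$$

$$P^{(3)}(f_1, f_2, f_3) = P^{(3)}(f_\alpha, f_\beta, f_\gamma)\cdot\Theta_{\alpha\beta\gamma}. \qquad (21b)^{\dagger}$$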


The generalization of (21) to higher-order kernels requires the intro- 
duction of permutation matrices of more than three indices. 

If the given system kernels, say, $\hat{P}^{(2)}(f_1, f_2)$ and $\hat{P}^{(3)}(f_1, f_2, f_3)$, are
unsymmetric, they can be symmetrized, i.e., made to satisfy (21),
through the use of the relations

$$P^{(2)}(f_1, f_2) = \tfrac{1}{2}\,[\hat{P}^{(2)}(f_1, f_2) + \hat{P}^{(2)}(f_2, f_1)\cdot R], \qquad (22a)^{\dagger}$$

$$P^{(3)}(f_1, f_2, f_3) = \tfrac{1}{6} \sum \hat{P}^{(3)}(f_\alpha, f_\beta, f_\gamma)\cdot\Theta_{\alpha\beta\gamma}, \qquad (22b)^{\dagger}$$

where the summation is performed over α, β, γ assuming all 6 permutations
of 1, 2, 3. These symmetrization relations are generalizations
of those discussed in Refs. 7, 9, and 14 for scalar kernels.
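
The second-order symmetrization can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated kernel samples at one frequency pair; the function names and dimensions are hypothetical and are not part of the paper. It builds the reversing matrix from its defining property (19) and confirms that the symmetrized kernel satisfies (21a) and (18a).

```python
import numpy as np

def reversing_matrix(n):
    """Return the n^2 x n^2 matrix R with R @ (u1 x u2) = u2 x u1."""
    R = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            R[i + n * j, j + n * i] = 1.0
    return R

def symmetrize_order2(P_hat_12, P_hat_21, R):
    """Eq. (22a): P(f1, f2) = 1/2 [P_hat(f1, f2) + P_hat(f2, f1) . R]."""
    return 0.5 * (P_hat_12 + P_hat_21 @ R)

rng = np.random.default_rng(0)
n, m = 3, 2                      # hypothetical numbers of inputs and outputs
R = reversing_matrix(n)

# Hypothetical samples of an unsymmetric kernel at (f1, f2) and at (f2, f1).
P_hat_12 = rng.standard_normal((m, n * n)) + 1j * rng.standard_normal((m, n * n))
P_hat_21 = rng.standard_normal((m, n * n)) + 1j * rng.standard_normal((m, n * n))

P_12 = symmetrize_order2(P_hat_12, P_hat_21, R)   # P(f1, f2)
P_21 = symmetrize_order2(P_hat_21, P_hat_12, R)   # P(f2, f1)

u1 = rng.standard_normal(n)
u2 = rng.standard_normal(n)
cond_21a = np.allclose(P_12, P_21 @ R)
cond_18a = np.allclose(P_12 @ np.kron(u1, u2), P_21 @ np.kron(u2, u1))
print(cond_21a, cond_18a)
```

For two vectors of equal length the reversing matrix is the same whether the left or the right Kronecker product is used, so `np.kron` is adequate for this particular check.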


V. RESPONSE TO SINUSOIDAL EXCITATION 


The response of a nonlinear scalar system to multiple-sinusoidal 
excitation has been studied by several authors including Bedrosian 
and Rice,’ Goldman,® and Chua and Ng.” Here we generalize some of 
their results to nonlinear MIMO systems. 


5.1 Multiple-exponential excitation 


Let the input vector be

$$u(t) = \sum_{i=1}^{l} u_i \exp(j2\pi f_i t), \qquad (23)$$

where the $u_i$'s are time-independent, complex, n × 1 vectors. The
Fourier transform of u(t) is

$$U(f) = \sum_{i=1}^{l} u_i\,\delta(f - f_i). \qquad (24)$$

Substituting (24) into (16), and using (15b), one obtains the kth order
output component

$$w^{(k)}(t) = w^{(k)}(t, \cdots, t) = \sum_{i_1=1}^{l} \cdots \sum_{i_k=1}^{l} \{[P^{(k)}(f_{i_1}, \cdots, f_{i_k})\cdot(u_{i_1} \times \cdots \times u_{i_k})]\exp[j2\pi(f_{i_1} + \cdots + f_{i_k})t]\}. \qquad (25)$$

Finally, the output, w(t), is obtained from (13), i.e., by summing $w^{(k)}(t)$
from k = 1 up to any desired order. Note that (25) is valid whether
or not the system kernels are symmetric.


5.2 Single-frequency excitation 
Let the (real) n × 1 input vector be

$$u(t) = \mathrm{Real}[a \exp(j2\pi f t)] = \tfrac{1}{2} a \exp(j2\pi f t) + \tfrac{1}{2} a^* \exp(-j2\pi f t), \qquad (26)$$

where the asterisk refers to complex conjugation. Comparing (26) to
(23), one obtains l = 2, $u_1 = \tfrac{1}{2}a$, $u_2 = \tfrac{1}{2}a^*$, $f_1 = f$, and $f_2 = -f$. Thus,
using (25), and assuming that the kernels are symmetric, i.e., that (18)
is satisfied, one obtains the following expressions for the various kth
order output components:

$$w^{(1)}(t) = \tfrac{1}{2}[P^{(1)}(f)\cdot a]\exp(j2\pi f t) + \tfrac{1}{2}[P^{(1)}(-f)\cdot a^*]\exp(-j2\pi f t). \qquad (27a)$$

$$w^{(2)}(t) = \tfrac{1}{2}[P^{(2)}(f, -f)\cdot(a \times a^*)] \ \text{(dc term)} + \tfrac{1}{4}[P^{(2)}(f, f)\cdot(a \times a)]\exp[j2\pi(2f)t] + \tfrac{1}{4}[P^{(2)}(-f, -f)\cdot(a^* \times a^*)]\exp[-j2\pi(2f)t]. \qquad (27b)$$

$$w^{(3)}(t) = \tfrac{3}{8}[P^{(3)}(f, f, -f)\cdot(a \times a \times a^*)]\exp(j2\pi f t) + \tfrac{3}{8}[P^{(3)}(-f, -f, f)\cdot(a^* \times a^* \times a)]\exp(-j2\pi f t) + \tfrac{1}{8}[P^{(3)}(f, f, f)\cdot(a \times a \times a)]\exp[j2\pi(3f)t] + \tfrac{1}{8}[P^{(3)}(-f, -f, -f)\cdot(a^* \times a^* \times a^*)]\exp[-j2\pi(3f)t]. \qquad (27c)$$

Note that the asterisks on the a's correspond in number and location
to the negative signs in the frequency arguments of the associated
kernels.
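
As a sanity check on the coefficients in (27b), the sketch below compares the formula with a direct time-domain evaluation. It assumes a hypothetical real, memoryless second-order system, so that the kernel is a constant matrix independent of frequency; the dimensions, the frequency value, and all names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 2, 2          # hypothetical numbers of inputs and outputs
f = 3.0              # hypothetical excitation frequency

# Reversing matrix R of (19), used to symmetrize the constant kernel.
R = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        R[i + n * j, j + n * i] = 1.0

P2_hat = rng.standard_normal((m, n * n))
P2 = 0.5 * (P2_hat + P2_hat @ R)      # symmetric, real, frequency independent

a = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def w2_direct(t):
    """Second-order output computed directly from u(t) = Real[a exp(j 2 pi f t)]."""
    u = np.real(a * np.exp(2j * np.pi * f * t))
    return P2 @ np.kron(u, u)

def w2_formula(t):
    """Second-order output from (27b): dc term plus second-harmonic pair."""
    dc = 0.5 * (P2 @ np.kron(a, np.conj(a)))
    second = 0.25 * (P2 @ np.kron(a, a)) * np.exp(2j * np.pi * 2 * f * t)
    return dc + second + np.conj(second)   # third term is the conjugate for a real system

times = np.linspace(0.0, 1.0, 25)
max_error = max(np.max(np.abs(w2_direct(t) - w2_formula(t))) for t in times)
print(max_error)
```

The Kronecker product ordering does not matter here as long as the same ordering is used for the kernel and for the input products.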


If the system is real, i.e., if $p^{(k)}(\tau_1, \cdots, \tau_k)$, k = 1, 2, 3, ···, are real,
then it can be shown from (15) that

$$P^{(k)}(f_1, \cdots, f_k) = [P^{(k)}(-f_1, \cdots, -f_k)]^*. \qquad (28)$$

As expected, (28) implies that all the output components given in (27)
are real. In that case, it can be shown through generalizing (27) that
the total mth harmonic output term, $w_m(t)$, m = 0, 1, 2, ···, is given
by

$$w_m(t) = \epsilon_m\,\mathrm{Real}\Big\{\exp(j2\pi m f t) \sum_{k=m,m+2,\cdots} 2^{-k} \binom{k}{(k-m)/2}\, P^{(k)}(\underbrace{f, \cdots, f}_{(k+m)/2}, \underbrace{-f, \cdots, -f}_{(k-m)/2})\cdot[a^{[(k+m)/2]} \times (a^*)^{[(k-m)/2]}]\Big\}, \qquad (29)$$

where $a^{[l]}$ is the l-fold Kronecker product a × ··· × a, $\epsilon_m$ is the
Neumann factor (which is equal to 1 when m = 0, and is equal to 2
when m ≠ 0), and $P^{(0)}$ is defined to be zero.

Because of the symmetry conditions of (21), the kernels used in (27) 
and (29) satisfy the relations 


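$$P^{(2)}(f, f) = P^{(2)}(f, f)\cdot R, \qquad (30a)^{\dagger}$$

$$P^{(2)}(f, -f) = [P^{(2)}(f, -f)]^*\cdot R, \qquad (30b)^{\dagger}$$

$$P^{(3)}(f, f, -f) = P^{(3)}(f, f, -f)\cdot\Theta_{213}, \qquad (30c)^{\dagger}$$

$$P^{(3)}(f, f, f) = P^{(3)}(f, f, f)\cdot\Theta_{\alpha\beta\gamma}. \qquad (30d)^{\dagger}$$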


In addition to the kernel symmetry requirement, (30b) is based on the
assumption that the system is real, i.e., that (28) is satisfied. The
implication of (30) is that the elements of each of the system kernels
are not all independent. For example, if n = 2, (30a) through (30d)
imply, respectively, that (i) columns 2 and 3 of $P^{(2)}(f, f)$ are equal; (ii)
column 2 of $P^{(2)}(f, -f)$ is the complex conjugate of column 3, and
columns 1 and 4 are real; (iii) columns 2 and 3 of $P^{(3)}(f, f, -f)$ are
equal, and so are columns 6 and 7; and (iv) columns 2, 3, and 5 of
$P^{(3)}(f, f, f)$ are equal, and so are columns 4, 6, and 7. It is worth
mentioning that (30a) and (30d), respectively, would also be satisfied
by $P^{(2)}$ and $P^{(3)}$ of the memoryless system represented by (4).


5.3 Two-frequency excitation 
Let the (real) n × 1 input vector be

$$u(t) = \mathrm{Real}[a \exp(j2\pi f_a t) + b \exp(j2\pi f_b t)]. \qquad (31)$$

We assume that the system is real, and that the kernels are symmetric,
i.e., that (28) and (18) are satisfied. One can use (25) to obtain the
output corresponding to (31) by following the same steps used to derive
(29). The leading terms at some of the various output frequencies are:

$$w(t)|_{dc} \approx \tfrac{1}{2}[P^{(2)}(f_a, -f_a)\cdot(a \times a^*) + P^{(2)}(f_b, -f_b)\cdot(b \times b^*)]. \qquad (32a)$$

$$w(t)|_{f_a} \approx \mathrm{Real}\{\exp(j2\pi f_a t)[P^{(1)}(f_a)\cdot a + \tfrac{3}{4}P^{(3)}(f_a, f_a, -f_a)\cdot(a \times a \times a^*) + \tfrac{3}{2}P^{(3)}(f_a, f_b, -f_b)\cdot(a \times b \times b^*)]\}. \qquad (32b)$$

$$w(t)|_{2f_a - f_b} \approx \mathrm{Real}\{\tfrac{3}{4}\exp[j2\pi(2f_a - f_b)t]\cdot P^{(3)}(f_a, f_a, -f_b)\cdot(a \times a \times b^*)\}. \qquad (32c)$$

$$w(t)|_{l f_a \pm m f_b} \approx \mathrm{Real}\Big\{\frac{(l+m)!}{l!\,m!\,2^{l+m-1}}\exp[j2\pi(l f_a \pm m f_b)t]\cdot P^{(l+m)}(\underbrace{f_a, \cdots, f_a}_{l}, \underbrace{\pm f_b, \cdots, \pm f_b}_{m})\cdot[a^{[l]} \times (b^{\pm})^{[m]}]\Big\}, \qquad (32d)$$

where l > 0 and m ≥ 0, and where we defined $b^{+} = b$ and $b^{-} = b^*$.


5.4 Three-frequency excitation 
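Let the (real) n × 1 input vector be

$$u(t) = \mathrm{Real}[a \exp(j2\pi f_a t) + b \exp(j2\pi f_b t) + c \exp(j2\pi f_c t)]. \qquad (33)$$

Again, we assume that the system is real, and that the kernels are
symmetric. Following the same steps used to derive (29) and (32), one
can obtain the following leading terms at some of the various output
frequencies:

$$w(t)|_{dc} \approx \tfrac{1}{2}[P^{(2)}(f_a, -f_a)\cdot(a \times a^*) + P^{(2)}(f_b, -f_b)\cdot(b \times b^*) + P^{(2)}(f_c, -f_c)\cdot(c \times c^*)]. \qquad (34a)$$

$$w(t)|_{f_a} \approx \mathrm{Real}\{\exp(j2\pi f_a t)[P^{(1)}(f_a)\cdot a + \tfrac{3}{4}P^{(3)}(f_a, f_a, -f_a)\cdot(a \times a \times a^*) + \tfrac{3}{2}P^{(3)}(f_a, f_b, -f_b)\cdot(a \times b \times b^*) + \tfrac{3}{2}P^{(3)}(f_a, f_c, -f_c)\cdot(a \times c \times c^*)]\}. \qquad (34b)$$

$$w(t)|_{f_a + f_b - f_c} \approx \mathrm{Real}\{\tfrac{3}{2}\exp[j2\pi(f_a + f_b - f_c)t]\cdot P^{(3)}(f_a, f_b, -f_c)\cdot(a \times b \times c^*)\}. \qquad (34c)$$

$$w(t)|_{k f_a \pm l f_b \oplus m f_c} \approx \mathrm{Real}\Big\{\frac{(k+l+m)!}{k!\,l!\,m!\,2^{k+l+m-1}}\exp[j2\pi(k f_a \pm l f_b \oplus m f_c)t]\cdot P^{(k+l+m)}(\underbrace{f_a, \cdots, f_a}_{k}, \underbrace{\pm f_b, \cdots, \pm f_b}_{l}, \underbrace{\oplus f_c, \cdots, \oplus f_c}_{m})\cdot[a^{[k]} \times (b^{\pm})^{[l]} \times (c^{\oplus})^{[m]}]\Big\}, \qquad (34d)$$

where k, l, m ≥ 0, but at least one of them being nonzero, and where
the sign symbols ± and ⊕ are each consistent throughout the equation,
but are otherwise independent.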


VI. SYSTEM OPERATIONS
6.1 Operational notation 


Let the input-output relations given in (11) through (17) be written
symbolically as

$$W_m = \{P^{(k)}_{mn}\} \circ U_n, \qquad (35)$$

where “∘” means “operating on.” The frequency dependence has been
omitted for simplicity. The subscripts n and m are included to emphasize
the numbers of inputs and outputs. On some occasions, these
subscripts will be eliminated.

If the system is linear, i.e., if $P^{(k)} = 0$ for k > 1, the operation in (35)
reduces to an ordinary matrix product. Thus,

$$W = \{P^{(k)}\} \circ U = P^{(1)}\cdot U. \qquad (36)$$


The operational notation of (35), and the three system operations of 
addition, cascading, and inversion, which are discussed in the next 
three subsections, form an algebraic structure that permits a shorthand 
description of complex interconnections of nonlinear MIMO systems. 
The laws of this algebra’ are identical to the algebra of linear systems 
(i.e., the algebra of matrices) with two important exceptions—the left 
distributive law does not hold, and the laws of multiplication by a 
scalar constant are more complex. 


6.2 Addition 


Two systems, $\{P^{(k)}_{mn}\}$ and $\{Q^{(k)}_{mn}\}$, having the same number of inputs,
n, and the same number of outputs, m, are said to be “added” if they
share the same input vector, $U_n$, and if their respective outputs are
added to form the final output vector, $W_m$. This operation, which is
shown schematically in Fig. 3, is represented by

$$W_m = \{P^{(k)}_{mn}\} \circ U_n + \{Q^{(k)}_{mn}\} \circ U_n = [\{P^{(k)}_{mn}\} + \{Q^{(k)}_{mn}\}] \circ U_n = \{S^{(k)}_{mn}\} \circ U_n. \qquad (37)$$

Fig. 3. A schematic representation of the addition operation $\{S^{(k)}_{mn}\} = \{P^{(k)}_{mn}\} + \{Q^{(k)}_{mn}\}$.

The kernels of the sum system,

$$\{S^{(k)}_{mn}\} = \{P^{(k)}_{mn}\} + \{Q^{(k)}_{mn}\}, \qquad (38)$$

are given by

$$S^{(k)}(f_1, \cdots, f_k) = P^{(k)}(f_1, \cdots, f_k) + Q^{(k)}(f_1, \cdots, f_k), \qquad (39)$$

where the plus sign refers to matrix addition.

One can define a subtraction operation in an obvious manner. A 
multiplication operation,”’* which is more involved, can also be de- 
fined. 


6.3 Cascading 


When the output vector, $W_m$, of a system, $\{P^{(k)}_{mn}\}$, is used as an input
vector to a second system, $\{Q^{(k)}_{lm}\}$, whose output vector is $X_l$, the two
systems are said to be in “cascade.” This operation, which is shown
schematically in Fig. 4, is represented by

$$X_l = \{Q^{(k)}_{lm}\} \circ W_m = \{Q^{(k)}_{lm}\} \circ [\{P^{(k)}_{mn}\} \circ U_n] = [\{Q^{(k)}_{lm}\} * \{P^{(k)}_{mn}\}] \circ U_n = \{T^{(k)}_{ln}\} \circ U_n, \qquad (40)$$

Fig. 4. A schematic representation of the cascade operation $\{T^{(k)}_{ln}\} = \{Q^{(k)}_{lm}\} * \{P^{(k)}_{mn}\}$.


where the asterisk refers to the cascade operation. The kernels of the
cascade system,

$$\{T^{(k)}_{ln}\} = \{Q^{(k)}_{lm}\} * \{P^{(k)}_{mn}\}, \qquad (41)$$

can be obtained by substituting the output expression of the first
system into the system equations of the second system, as was done in
Refs. 3 and 9 to derive the cascade relations of scalar systems. This
procedure is straightforward, but somewhat tedious. A simpler approach
is to employ the harmonic probing method discussed in Refs.
7 and 9, and the expression for the response of nonlinear vector systems
to multiple-exponential excitation given in (25). The resulting relations
for the cascade kernels are

$$T^{(1)}(f_1) = Q^{(1)}(f_1)\cdot P^{(1)}(f_1), \qquad (42a)^{\dagger}$$

$$T^{(2)}(f_1, f_2) = Q^{(1)}(f_1 + f_2)\cdot P^{(2)}(f_1, f_2) + Q^{(2)}(f_1, f_2)\cdot[P^{(1)}(f_1) \times P^{(1)}(f_2)], \qquad (42b)^{\dagger}$$

$$\hat{T}^{(3)}(f_1, f_2, f_3) = Q^{(1)}(f_1 + f_2 + f_3)\cdot P^{(3)}(f_1, f_2, f_3) + Q^{(2)}(f_1, f_2 + f_3)\cdot[P^{(1)}(f_1) \times P^{(2)}(f_2, f_3)] + Q^{(2)}(f_1 + f_2, f_3)\cdot[P^{(2)}(f_1, f_2) \times P^{(1)}(f_3)] + Q^{(3)}(f_1, f_2, f_3)\cdot[P^{(1)}(f_1) \times P^{(1)}(f_2) \times P^{(1)}(f_3)]. \qquad (42\hat{c})^{\dagger}$$

A generalization of (42) for arbitrary k is given in Appendix D.
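
The cascade relations can be exercised on a simple special case. The sketch below assumes two hypothetical memoryless polynomial systems, for which every kernel is a constant matrix, and uses a helper `kp` (an assumed name) for the left Kronecker product of Appendix B. It forms the first three cascade kernels from (42a), (42b), and the unsymmetric third-order relation, then compares them with a direct evaluation of the cascade for a small input, where the neglected terms are of fourth order.

```python
import numpy as np

def kp(*factors):
    """Left Kronecker product A x B x ...; for two factors, A x B = np.kron(B, A)."""
    out = factors[0]
    for F in factors[1:]:
        out = np.kron(F, out)
    return out

def poly_system(K1, K2, K3, u):
    """Memoryless system output: K1.u + K2.(u x u) + K3.(u x u x u)."""
    return K1 @ u + K2 @ kp(u, u) + K3 @ kp(u, u, u)

def cascade_kernels(P1, P2, P3, Q1, Q2, Q3):
    """Constant-kernel forms of (42a), (42b) and the unsymmetric third-order relation."""
    T1 = Q1 @ P1
    T2 = Q1 @ P2 + Q2 @ kp(P1, P1)
    T3 = Q1 @ P3 + Q2 @ kp(P1, P2) + Q2 @ kp(P2, P1) + Q3 @ kp(P1, P1, P1)
    return T1, T2, T3

rng = np.random.default_rng(2)
n, m, l = 2, 3, 2     # hypothetical sizes: inputs, intermediate outputs, final outputs
P1, P2, P3 = (rng.standard_normal((m, n ** k)) for k in (1, 2, 3))
Q1, Q2, Q3 = (rng.standard_normal((l, m ** k)) for k in (1, 2, 3))
T1, T2, T3 = cascade_kernels(P1, P2, P3, Q1, Q2, Q3)

u = rng.standard_normal(n)
eps = 1e-5            # small input scale so that fourth-order terms are negligible
x = poly_system(Q1, Q2, Q3, poly_system(P1, P2, P3, eps * u))
third_order = eps ** 3 * (T3 @ kp(u, u, u))
residual = x - eps * (T1 @ u) - eps ** 2 * (T2 @ kp(u, u)) - third_order
ratio = np.linalg.norm(residual) / np.linalg.norm(third_order)
print(ratio)
```

The ratio should be small compared with one, since the residual contains only terms of order four and higher in the input scale.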

If the kernels of the cascaded systems are symmetric, i.e., satisfy
(21), then it can be shown that the resulting second-order kernel given
by (42b) is also symmetric. However, the resulting third-order kernel
given by (42ĉ) is not symmetric. This fact is indicated by the presence
of the circumflexes.

As mentioned in Section IV, it is desirable to deal with symmetric
kernels. Thus, using the symmetrization relation given in (22b), assuming
that the kernels of the cascaded systems are symmetric, and
employing the properties of the reversing and permutation matrices
given in Appendix C, one obtains the symmetric form of (42ĉ) as

$$T^{(3)}(f_1, f_2, f_3) = Q^{(1)}(f_1 + f_2 + f_3)\cdot P^{(3)}(f_1, f_2, f_3) + \tfrac{2}{3}\{Q^{(2)}(f_1, f_2 + f_3)\cdot[P^{(1)}(f_1) \times P^{(2)}(f_2, f_3)] + Q^{(2)}(f_2, f_3 + f_1)\cdot[P^{(1)}(f_2) \times P^{(2)}(f_3, f_1)]\cdot\Theta_{231} + Q^{(2)}(f_1 + f_2, f_3)\cdot[P^{(2)}(f_1, f_2) \times P^{(1)}(f_3)]\} + Q^{(3)}(f_1, f_2, f_3)\cdot[P^{(1)}(f_1) \times P^{(1)}(f_2) \times P^{(1)}(f_3)], \qquad (42c)^{\dagger}$$

where $\Theta_{231}$ is defined in (68) and (69).

If the first system, $\{P^{(k)}_{mn}\}$, is linear, (42) reduces to

$$T^{(k)}(f_1, \cdots, f_k) = Q^{(k)}(f_1, \cdots, f_k)\cdot[P^{(1)}(f_1) \times \cdots \times P^{(1)}(f_k)]. \qquad (43)$$

On the other hand, if the second system, $\{Q^{(k)}_{lm}\}$, is linear, (42) reduces
to

$$T^{(k)}(f_1, \cdots, f_k) = Q^{(1)}(f_1 + \cdots + f_k)\cdot P^{(k)}(f_1, \cdots, f_k). \qquad (44)$$

6.4 Inversion 

Let the numbers of inputs and outputs in the system represented by
(35) be equal, i.e., m = n. Suppose that it is required to find the input
vector, $U_n$, in terms of the output vector, $W_n$. This inversion operation
is represented by

$$U_n = \{P^{(k)}_{nn}\}^{-1} \circ W_n = \{Q^{(k)}_{nn}\} \circ W_n. \qquad (45)$$

To find the kernels of the inverse system,

$$\{Q^{(k)}_{nn}\} = \{P^{(k)}_{nn}\}^{-1}, \qquad (46)$$

it is helpful to use the interpretation given in Fig. 5, which defines the
inversion operation in terms of the cascade operation and the identity
system, $\{1_n\}$, where $1_n$ is the n × n identity matrix. Thus, applying the
symmetric cascade relations of (42) to Fig. 5b by interchanging the
roles of P and Q, setting $T^{(1)} = 1_n$, and $T^{(k)} = 0$ for k > 1, and solving
for $Q^{(k)}$, one obtains the symmetric inversion relations

$$Q^{(1)}(f_1) = [P^{(1)}(f_1)]^{-1}, \qquad (47a)$$

$$Q^{(2)}(f_1, f_2) = -Q^{(1)}(f_1 + f_2)\cdot P^{(2)}(f_1, f_2)\cdot[Q^{(1)}(f_1) \times Q^{(1)}(f_2)], \qquad (47b)$$

$$Q^{(3)}(f_1, f_2, f_3) = -Q^{(1)}(f_1 + f_2 + f_3)\cdot\{\tfrac{2}{3}(P^{(2)}(f_1, f_2 + f_3)\cdot[Q^{(1)}(f_1) \times Q^{(2)}(f_2, f_3)] + P^{(2)}(f_2, f_3 + f_1)\cdot[Q^{(1)}(f_2) \times Q^{(2)}(f_3, f_1)]\cdot\Theta_{231} + P^{(2)}(f_1 + f_2, f_3)\cdot[Q^{(2)}(f_1, f_2) \times Q^{(1)}(f_3)]) + P^{(3)}(f_1, f_2, f_3)\cdot[Q^{(1)}(f_1) \times Q^{(1)}(f_2) \times Q^{(1)}(f_3)]\}. \qquad (47c)$$

Fig. 5. Two equivalent interpretations of the inversion operation $\{Q^{(k)}_{nn}\} = \{P^{(k)}_{nn}\}^{-1}$:
(a) $\{Q^{(k)}_{nn}\} * \{P^{(k)}_{nn}\} = \{1_n\}$, and (b) $\{P^{(k)}_{nn}\} * \{Q^{(k)}_{nn}\} = \{1_n\}$, where $\{1_n\}$ is the identity
system.

Note that the inverse system exists if and only if $P^{(1)}(f)$ is nonsingular.
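
A minimal sketch of the first two inversion relations follows, again assuming a hypothetical memoryless system so that the kernels are constant matrices; `kp` and `inverse_kernels_order2` are illustrative names. It checks that cascading the system with its computed inverse, in either order, gives an identity first-order kernel and a vanishing second-order kernel, as required by Fig. 5.

```python
import numpy as np

def kp(A, B):
    """Left Kronecker product of Appendix B: A x B = np.kron(B, A)."""
    return np.kron(B, A)

def inverse_kernels_order2(P1, P2):
    """Constant-kernel forms of (47a) and (47b)."""
    Q1 = np.linalg.inv(P1)
    Q2 = -Q1 @ P2 @ kp(Q1, Q1)
    return Q1, Q2

rng = np.random.default_rng(3)
n = 3   # hypothetical number of inputs and outputs
# Strictly diagonally dominant first-order kernel, hence nonsingular.
P1 = 4.0 * np.eye(n) + rng.uniform(-1.0, 1.0, (n, n))
P2 = rng.standard_normal((n, n * n))

Q1, Q2 = inverse_kernels_order2(P1, P2)

# Cascade {Q} * {P}, using (42a) and (42b) with the roles as written.
T1_qp = Q1 @ P1
T2_qp = Q1 @ P2 + Q2 @ kp(P1, P1)

# Cascade {P} * {Q}, i.e., (42a) and (42b) with P and Q interchanged.
T1_pq = P1 @ Q1
T2_pq = P1 @ Q2 + P2 @ kp(Q1, Q1)

print(np.abs(T2_qp).max(), np.abs(T2_pq).max())
```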


6.5 Feedback 


As an application of the three system operations discussed in the
previous subsections, consider the nonlinear, feedback, MIMO system
shown schematically in Fig. 6, where both the forward, $\{P^{(k)}_{mn}\}$, and
reverse, $\{Q^{(k)}_{nm}\}$, branches are nonlinear. Using the operational notation
of (35), one obtains

$$W_m = \{P^{(k)}_{mn}\} \circ X_n, \qquad (48a)$$

$$X_n = U_n + \{Q^{(k)}_{nm}\} \circ W_m, \qquad (48b)$$

where $U_n$, $X_n$, and $W_m$ are the n × 1 input vector, the n × 1 intermediate
vector, and the m × 1 output vector, respectively. Substituting $W_m$
from (48a) into (48b), solving for $X_n$ in terms of $U_n$, and substituting
the result in (48a), one obtains the feedback system equation

$$W_m = \{F^{(k)}_{mn}\} \circ U_n, \qquad (49)$$

where

$$\{F^{(k)}_{mn}\} = \{P^{(k)}_{mn}\} * [\{1_n\} - \{Q^{(k)}_{nm}\} * \{P^{(k)}_{mn}\}]^{-1}. \qquad (50)$$

Thus, the kernels of the feedback system can be obtained by applying
the subtraction, cascade, and inversion operations discussed above.
However, the explicit formulas for these kernels will not be given here.
Actually, special cases of these formulas have been obtained for scalar
systems in Refs. 2, 6, and 7.

Fig. 6. A schematic representation of a nonlinear MIMO feedback system.

Note from (50) and Section 6.4, that the feedback kernels exist if
and only if the n × n matrix $[1_n - Q^{(1)}_{nm}(f)\cdot P^{(1)}_{mn}(f)]$ is nonsingular.
Note also that if m = n, and if $P^{(1)}_{nn}(f)$ is nonsingular, then (50) reduces
to

$$\{F^{(k)}_{nn}\} = [\{P^{(k)}_{nn}\}^{-1} - \{Q^{(k)}_{nn}\}]^{-1}. \qquad (51)$$

If the system in Fig. 6 is changed to a negative feedback system,
then the minus signs in (50) and (51) should be changed to plus signs.


VII. CONCLUSIONS


A method of analysis has been presented for mildly nonlinear MIMO 
systems with memory. The method utilizes Volterra series whose 
kernels are two-dimensional matrices. The analysis was made possible 
through the use of the Kronecker product of matrices, which is a 
simple but powerful tool in matrix theory. This results in a compact 
representation of the system equations, and facilitates the systematic 
performance of various useful system operations, such as addition, 
cascading, inversion, and feedback. These operations can be used to 
describe a complex, nonlinear MIMO system as an interconnection of 
simple subsystems. 
APPENDIX A


Index Notation 


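Here we rewrite, in the index notation, some of the key equations
marked by a dagger (†) in the body of the paper. The same equation
numbers are used here as are used in the text. Before doing so,
however, we note from (7) that, for MIMO systems with memory, the
matrix kernels used in the matrix notations are related to the array
kernels used in the index notation by the relations

$$[p^{(k)}(\tau_1, \cdots, \tau_k)]_{ij} = p^{(k)}_{i j_1 \cdots j_k}(\tau_1, \cdots, \tau_k), \qquad (52)$$

$$[P^{(k)}(f_1, \cdots, f_k)]_{ij} = P^{(k)}_{i j_1 \cdots j_k}(f_1, \cdots, f_k), \qquad (53)$$

where j is given by (6).
A list of the equations in question follows.

$$w_i^{(k)}(t_1, \cdots, t_k) = \sum_{j_1=1}^{n} \cdots \sum_{j_k=1}^{n} \int \cdots \int p^{(k)}_{i j_1 \cdots j_k}(\tau_1, \cdots, \tau_k)\,u_{j_1}(t_1 - \tau_1) \cdots u_{j_k}(t_k - \tau_k)\,d\tau_1 \cdots d\tau_k. \qquad (12)^{\dagger}$$

$$W_i^{(k)}(f_1, \cdots, f_k) = \sum_{j_1=1}^{n} \cdots \sum_{j_k=1}^{n} P^{(k)}_{i j_1 \cdots j_k}(f_1, \cdots, f_k)\,U_{j_1}(f_1) \cdots U_{j_k}(f_k). \qquad (16)^{\dagger}$$

$$P^{(2)}_{i j_1 j_2}(f_1, f_2) = P^{(2)}_{i j_2 j_1}(f_2, f_1). \qquad (21a)^{\dagger}$$

$$P^{(3)}_{i j_1 j_2 j_3}(f_1, f_2, f_3) = P^{(3)}_{i j_\alpha j_\beta j_\gamma}(f_\alpha, f_\beta, f_\gamma). \qquad (21b)^{\dagger}$$

$$P^{(2)}_{i j_1 j_2}(f_1, f_2) = \tfrac{1}{2}[\hat{P}^{(2)}_{i j_1 j_2}(f_1, f_2) + \hat{P}^{(2)}_{i j_2 j_1}(f_2, f_1)]. \qquad (22a)^{\dagger}$$

$$P^{(3)}_{i j_1 j_2 j_3}(f_1, f_2, f_3) = \tfrac{1}{6}\sum \hat{P}^{(3)}_{i j_\alpha j_\beta j_\gamma}(f_\alpha, f_\beta, f_\gamma). \qquad (22b)^{\dagger}$$

$$P^{(2)}_{i j_1 j_2}(f, f) = P^{(2)}_{i j_2 j_1}(f, f). \qquad (30a)^{\dagger}$$

$$P^{(2)}_{i j_1 j_2}(f, -f) = [P^{(2)}_{i j_2 j_1}(f, -f)]^*. \qquad (30b)^{\dagger}$$

$$P^{(3)}_{i j_1 j_2 j_3}(f, f, -f) = P^{(3)}_{i j_2 j_1 j_3}(f, f, -f). \qquad (30c)^{\dagger}$$

$$P^{(3)}_{i j_1 j_2 j_3}(f, f, f) = P^{(3)}_{i j_\alpha j_\beta j_\gamma}(f, f, f). \qquad (30d)^{\dagger}$$

$$T^{(1)}_{ij}(f_1) = \sum_{\alpha=1}^{m} Q^{(1)}_{i\alpha}(f_1)\,P^{(1)}_{\alpha j}(f_1). \qquad (42a)^{\dagger}$$

$$T^{(2)}_{i j_1 j_2}(f_1, f_2) = \sum_{\alpha=1}^{m} Q^{(1)}_{i\alpha}(f_1 + f_2)\,P^{(2)}_{\alpha j_1 j_2}(f_1, f_2) + \sum_{\alpha=1}^{m}\sum_{\beta=1}^{m} Q^{(2)}_{i\alpha\beta}(f_1, f_2)\,P^{(1)}_{\alpha j_1}(f_1)\,P^{(1)}_{\beta j_2}(f_2). \qquad (42b)^{\dagger}$$

$$\hat{T}^{(3)}_{i j_1 j_2 j_3}(f_1, f_2, f_3) = \sum_{\alpha=1}^{m} Q^{(1)}_{i\alpha}(f_1 + f_2 + f_3)\,P^{(3)}_{\alpha j_1 j_2 j_3}(f_1, f_2, f_3) + \sum_{\alpha=1}^{m}\sum_{\beta=1}^{m} [Q^{(2)}_{i\alpha\beta}(f_1, f_2 + f_3)\,P^{(1)}_{\alpha j_1}(f_1)\,P^{(2)}_{\beta j_2 j_3}(f_2, f_3) + Q^{(2)}_{i\alpha\beta}(f_1 + f_2, f_3)\,P^{(2)}_{\alpha j_1 j_2}(f_1, f_2)\,P^{(1)}_{\beta j_3}(f_3)] + \sum_{\alpha=1}^{m}\sum_{\beta=1}^{m}\sum_{\gamma=1}^{m} Q^{(3)}_{i\alpha\beta\gamma}(f_1, f_2, f_3)\,P^{(1)}_{\alpha j_1}(f_1)\,P^{(1)}_{\beta j_2}(f_2)\,P^{(1)}_{\gamma j_3}(f_3). \qquad (42\hat{c})^{\dagger}$$

$$T^{(3)}_{i j_1 j_2 j_3}(f_1, f_2, f_3) = \sum_{\alpha=1}^{m} Q^{(1)}_{i\alpha}(f_1 + f_2 + f_3)\,P^{(3)}_{\alpha j_1 j_2 j_3}(f_1, f_2, f_3) + \tfrac{2}{3}\sum_{\alpha=1}^{m}\sum_{\beta=1}^{m} [Q^{(2)}_{i\alpha\beta}(f_1, f_2 + f_3)\,P^{(1)}_{\alpha j_1}(f_1)\,P^{(2)}_{\beta j_2 j_3}(f_2, f_3) + Q^{(2)}_{i\alpha\beta}(f_2, f_3 + f_1)\,P^{(1)}_{\alpha j_2}(f_2)\,P^{(2)}_{\beta j_3 j_1}(f_3, f_1) + Q^{(2)}_{i\alpha\beta}(f_3, f_1 + f_2)\,P^{(1)}_{\alpha j_3}(f_3)\,P^{(2)}_{\beta j_1 j_2}(f_1, f_2)] + \sum_{\alpha=1}^{m}\sum_{\beta=1}^{m}\sum_{\gamma=1}^{m} Q^{(3)}_{i\alpha\beta\gamma}(f_1, f_2, f_3)\,P^{(1)}_{\alpha j_1}(f_1)\,P^{(1)}_{\beta j_2}(f_2)\,P^{(1)}_{\gamma j_3}(f_3). \qquad (42c)^{\dagger}$$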


APPENDIX B 
Kronecker Product of Matrices 


Here we define the Kronecker product of matrices and summarize 
some of its properties that are used in this paper. More extensive 
coverage of this topic is given in Refs. 22 through 24. 

Let $A = [a_{i_a j_a}]$ and $B = [b_{i_b j_b}]$ be $m_a \times n_a$ and $m_b \times n_b$ matrices,
respectively. Their Kronecker product results in the $m_a m_b \times n_a n_b$
matrix, $C = [c_{i_c j_c}]$, given by

$$C = A \times B = \begin{bmatrix} A b_{11} & A b_{12} & \cdots & A b_{1 n_b} \\ A b_{21} & A b_{22} & \cdots & A b_{2 n_b} \\ \vdots & \vdots & & \vdots \\ A b_{m_b 1} & A b_{m_b 2} & \cdots & A b_{m_b n_b} \end{bmatrix}, \qquad (54)$$

where “×” is the Kronecker-product symbol. Thus,

$$c_{i_c j_c} = a_{i_a j_a} b_{i_b j_b}, \qquad (55a)$$

where

$$i_c = i_a + m_a(i_b - 1), \qquad (55b)$$

$$j_c = j_a + n_a(j_b - 1). \qquad (55c)$$

Note that, since $i_a \le m_a$ and $j_a \le n_a$, (55b) and (55c) have unique
solutions for $i_a$, $i_b$, $j_a$ and $j_b$ in terms of $i_c$ and $j_c$. Actually, (54) and
(55) define the left Kronecker product. One can also define a right
Kronecker product, which, however, is not used in this paper. In
general,

$$A \times B \ne B \times A. \qquad (56)$$

It can be shown that the Kronecker product has the following
properties:

$$A \times (B \times C) = (A \times B) \times C = A \times B \times C. \qquad (57)$$

$$(A + B) \times C = (A \times C) + (B \times C). \qquad (58)$$

$$A \times (B + C) = (A \times B) + (A \times C). \qquad (59)$$

$$(A\cdot B) \times (C\cdot D) = (A \times C)\cdot(B \times D). \qquad (60)$$

$$(A \times B)^{-1} = A^{-1} \times B^{-1}. \qquad (61)$$

$$(A \times B)^{T} = A^{T} \times B^{T}. \qquad (62)$$

In the above equations, “T” refers to matrix transposition, and the dot
implies ordinary matrix multiplication. The dimensions of the various
matrices are arbitrary, but of course, should be consistent with the
requirements of the inversion, addition, and ordinary multiplication
operations, where applicable.
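
The left Kronecker product of (54) differs from the ordering used by NumPy's `np.kron` only in the order of the factors. The sketch below, with a hypothetical helper `left_kron` and random test matrices, checks the index rule (55) in 0-based form together with properties (56), (60), and (62).

```python
import numpy as np

def left_kron(A, B):
    """Left Kronecker product of (54): block (ib, jb) equals A * b[ib, jb]."""
    return np.kron(B, A)

rng = np.random.default_rng(4)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((4, 2))
C = left_kron(A, B)
ma, na = A.shape
mb, nb = B.shape

# (55) with 0-based indices: c[ia + ma*ib, ja + na*jb] = a[ia, ja] * b[ib, jb]
index_rule_ok = all(
    np.isclose(C[ia + ma * ib, ja + na * jb], A[ia, ja] * B[ib, jb])
    for ia in range(ma) for ja in range(na)
    for ib in range(mb) for jb in range(nb)
)

# (56): the product is not commutative in general.
A_sq = rng.standard_normal((2, 2))
B_sq = rng.standard_normal((2, 2))
commutes = np.allclose(left_kron(A_sq, B_sq), left_kron(B_sq, A_sq))

# (60): (A.B2) x (C2.D2) = (A x C2).(B2 x D2)
B2 = rng.standard_normal((3, 2))
C2 = rng.standard_normal((2, 4))
D2 = rng.standard_normal((4, 3))
mixed_product_ok = np.allclose(left_kron(A @ B2, C2 @ D2),
                               left_kron(A, C2) @ left_kron(B2, D2))

# (62): transposition distributes over the product.
transpose_ok = np.allclose(left_kron(A, B).T, left_kron(A.T, B.T))
print(index_rule_ok, commutes, mixed_product_ok, transpose_ok)
```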


APPENDIX C 
Reversing and Permutation Matrices 

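Here we define the $n^2 \times n^2$ reversing matrix, $R^{(n)}$, and the six $n^3 \times n^3$
permutation matrices, $\Theta^{(n)}_{\alpha\beta\gamma}$, which satisfy (19) and (20). The superscript
“(n)” is used in this appendix to emphasize the dimensions. It
can be shown from (19) and (55) that $R^{(n)}$ is given by (cf. Ref. 24)

$$R^{(n)}_{i+n(j-1),\,k+n(l-1)} = \delta_{il}\delta_{jk}, \quad i, j, k, l = 1, 2, \cdots, n, \qquad (63)$$

where $\delta_{\alpha\beta}$ is the Kronecker delta, which is equal to 1 if α = β, and 0 if
α ≠ β. For example,

$$R^{(2)} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \qquad (64)$$

It can be verified that

$$R^{(n)} = [R^{(n)}]^{T} = [R^{(n)}]^{-1}, \qquad (65)$$

where “T” refers to matrix transposition. Moreover, if $M_1$ and $M_2$ are
m × n matrices, then

$$R^{(m)}\cdot(M_1 \times M_2)\cdot R^{(n)} = M_2 \times M_1, \qquad (66)$$

which is a generalization of (19).
It can be shown from (19), (20) and (60) that

$$\Theta^{(n)}_{123} = [\Theta^{(n)}_{123}]^{T} = [\Theta^{(n)}_{123}]^{-1} = 1_{n^3}, \qquad (67a)$$

$$\Theta^{(n)}_{132} = [\Theta^{(n)}_{132}]^{T} = [\Theta^{(n)}_{132}]^{-1} = 1_n \times R^{(n)}, \qquad (67b)$$

$$\Theta^{(n)}_{213} = [\Theta^{(n)}_{213}]^{T} = [\Theta^{(n)}_{213}]^{-1} = R^{(n)} \times 1_n, \qquad (67c)$$

$$\Theta^{(n)}_{231} = [\Theta^{(n)}_{312}]^{T} = [\Theta^{(n)}_{312}]^{-1} = [1_n \times R^{(n)}]\cdot[R^{(n)} \times 1_n], \qquad (67d)$$

$$\Theta^{(n)}_{312} = [\Theta^{(n)}_{231}]^{T} = [\Theta^{(n)}_{231}]^{-1} = [R^{(n)} \times 1_n]\cdot[1_n \times R^{(n)}], \qquad (67e)$$

$$\Theta^{(n)}_{321} = [\Theta^{(n)}_{321}]^{T} = [\Theta^{(n)}_{321}]^{-1} = [1_n \times R^{(n)}]\cdot[R^{(n)} \times 1_n]\cdot[1_n \times R^{(n)}] = [R^{(n)} \times 1_n]\cdot[1_n \times R^{(n)}]\cdot[R^{(n)} \times 1_n], \qquad (67f)$$

where $1_n$ and $1_{n^3}$ are the n × n and $n^3 \times n^3$ identity matrices,
respectively. Also, it can be shown from (20) and (55) that, if α, β, γ are
any permutation of 1, 2, 3, then the i-j element of $\Theta^{(n)}_{\alpha\beta\gamma}$ is given by

$$[\Theta^{(n)}_{\alpha\beta\gamma}]_{ij} = \delta_{i_1 j_\alpha}\delta_{i_2 j_\beta}\delta_{i_3 j_\gamma}, \qquad (68a)$$

where

$$i = i_1 + n(i_2 - 1) + n^2(i_3 - 1), \qquad (68b)$$

$$j = j_1 + n(j_2 - 1) + n^2(j_3 - 1), \qquad (68c)$$

and where $i_1, j_1, i_2, j_2, i_3, j_3 = 1, 2, \cdots, n$. For example, (67d) and (68)
give

$$\Theta^{(2)}_{231} = \begin{bmatrix} 1&0&0&0&0&0&0&0 \\ 0&0&1&0&0&0&0&0 \\ 0&0&0&0&1&0&0&0 \\ 0&0&0&0&0&0&1&0 \\ 0&1&0&0&0&0&0&0 \\ 0&0&0&1&0&0&0&0 \\ 0&0&0&0&0&1&0&0 \\ 0&0&0&0&0&0&0&1 \end{bmatrix}. \qquad (69)$$

It can be verified that if $M_1$, $M_2$ and $M_3$ are m × n matrices, then

$$\Theta^{(m)}_{\alpha\beta\gamma}\cdot(M_1 \times M_2 \times M_3)\cdot[\Theta^{(n)}_{\alpha\beta\gamma}]^{T} = M_\alpha \times M_\beta \times M_\gamma, \qquad (70)$$

which is a generalization of (55). Also, if M and N are m × n and
$m^2 \times n^2$ matrices, respectively, then

$$\Theta^{(m)}_{231}\cdot(M \times N)\cdot[\Theta^{(n)}_{231}]^{T} = N \times M, \qquad (71a)$$

$$\Theta^{(m)}_{312}\cdot(N \times M)\cdot[\Theta^{(n)}_{312}]^{T} = M \times N. \qquad (71b)$$

Moreover, if M and K are m × n and $m \times n^2$ matrices, respectively,
then

$$R^{(m)}\cdot(M \times K)\cdot[\Theta^{(n)}_{231}]^{T} = K \times M, \qquad (72a)$$

$$R^{(m)}\cdot(K \times M)\cdot[\Theta^{(n)}_{312}]^{T} = M \times K. \qquad (72b)$$

Finally, if M and L are m × n and $m^2 \times n$ matrices, respectively, then

$$\Theta^{(m)}_{231}\cdot(M \times L)\cdot R^{(n)} = L \times M, \qquad (73a)$$

$$\Theta^{(m)}_{312}\cdot(L \times M)\cdot R^{(n)} = M \times L. \qquad (73b)$$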
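
The definitions in this appendix can be exercised numerically. The sketch below is an illustration under stated assumptions: the helper names (`kp`, `reversing_matrix`, `permutation_matrix`) are hypothetical, indices are 0-based in the code, and `kp` implements the left Kronecker product of Appendix B. It builds a permutation matrix from the element rule (68) and checks the defining property (20), the factorization (67d), and the positions of the unit entries for n = 2.

```python
import numpy as np
from itertools import product

def kp(*factors):
    """Left Kronecker product A x B x ...; for two factors, A x B = np.kron(B, A)."""
    out = factors[0]
    for F in factors[1:]:
        out = np.kron(F, out)
    return out

def reversing_matrix(n):
    """n^2 x n^2 matrix of (63): R . (u1 x u2) = u2 x u1."""
    R = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            R[i + n * j, j + n * i] = 1.0
    return R

def permutation_matrix(n, perm):
    """n^3 x n^3 matrix of (68) for perm = (alpha, beta, gamma), given 1-based."""
    T = np.zeros((n ** 3, n ** 3))
    for j in product(range(n), repeat=3):          # j = (j1, j2, j3)
        i = tuple(j[p - 1] for p in perm)          # i1 = j_alpha, i2 = j_beta, i3 = j_gamma
        T[i[0] + n * i[1] + n * n * i[2], j[0] + n * j[1] + n * n * j[2]] = 1.0
    return T

n = 2
I = np.eye(n)
R = reversing_matrix(n)
T231 = permutation_matrix(n, (2, 3, 1))
T312 = permutation_matrix(n, (3, 1, 2))

rng = np.random.default_rng(5)
u1, u2, u3 = (rng.standard_normal(n) for _ in range(3))

property_20 = np.allclose(T231 @ kp(u1, u2, u3), kp(u2, u3, u1))
factorization_67d = np.allclose(T231, kp(I, R) @ kp(R, I))
inverse_is_transpose = np.allclose(T231.T, T312)
unit_columns = (np.argmax(T231, axis=1) + 1).tolist()   # 1-based column of the 1 in each row
print(property_20, factorization_67d, inverse_is_transpose, unit_columns)
```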
APPENDIX D
General Cascade Relation

Here we give a generalization of the unsymmetric cascade relations
given in (42a), (42b), and (42ĉ) for arbitrary k (cf. Refs. 7 and 9 for
scalar systems):

$$\hat{T}^{(k)}(f_1, \cdots, f_k) = \sum_{l=1}^{k}\ \sum_{\substack{k_1, k_2, \cdots, k_l = 1 \\ (k_1 + k_2 + \cdots + k_l = k)}}^{k-l+1} \Big\{ Q^{(l)}(f_1 + \cdots + f_{k_1},\ f_{k_1+1} + \cdots + f_{k_1+k_2},\ \cdots,\ f_{k-k_l+1} + \cdots + f_k)\cdot[P^{(k_1)}(f_1, \cdots, f_{k_1}) \times P^{(k_2)}(f_{k_1+1}, \cdots, f_{k_1+k_2}) \times \cdots \times P^{(k_l)}(f_{k-k_l+1}, \cdots, f_k)] \Big\}. \qquad (74)$$

Note that the second summation contains $\binom{k-1}{l-1}$ terms, and that the
frequency arguments always appear in the order $f_1, f_2, \cdots, f_k$. As is
the case with (42ĉ), the cascade relation of (74) does not preserve
kernel symmetry for k ≥ 3. The symmetric form of (74), which would
generalize (42c), will not be given since it requires the use of permutation
matrices of more than three indices, which have not been
introduced yet.
Computing the Distribution of a Random
Variable via Gaussian Quadrature Rules 


By M. H. MEYERS 
(Manuscript received March 26, 1982) 


Using the technique of Gaussian quadrature rules, a new estimator 
is proposed for approximating the distribution of a random variable 
given only a finite number of its moments. The estimator is shown by 
numerous examples to be accurate on the tails of both continuous and 
discrete distributions. Efficient algorithms exist for computing the 
estimator from the first 2N moments of the random variable. A robust 
implementation of the estimator is presented, along with rules that 
provide additional protection against computer roundoff errors. 


I. INTRODUCTION


In this paper we present a method for computing the Cumulative
Distribution Function (CDF) of an arbitrary random variable. Using
the theory of Gaussian Quadrature Rules (GQRs), we derive an estimator
that converges asymptotically to the true CDF. In practice,
convergence is obtained without excessive computation. A general
estimator is developed here that is applicable to a wide class of
problems.

Section 2.1 begins with a review of GQR analysis as it has traditionally
been used for numerical integration. Several authors have shown the
existence of extremely efficient algorithms for computing the parameters
of the GQR. An efficient and robust procedure for obtaining the
GQR parameters is presented in the appendix. Two CDF estimators
based on GQR are derived in Sections 2.3 and 2.4. The first estimator
is most suited to numerical integration schemes and estimation of
discrete distributions, while the second is appropriate for continuous
distributions such as Gaussian noise or crosstalk. Section III gives
numerous examples that show the inherent accuracy of the technique
for continuous, discrete, and mixed distributions. Computational methods
for deriving the required moments are discussed, along with
modifications that tend to mitigate the roundoff errors that plague
GQR analysis of nonsymmetric distributions.


II. THEORY AND PROPERTIES OF GQR
2.1 Classical use of GQR 


The GQR has traditionally been used as a numerical integration
procedure and is particularly efficient for computing integrals of the
form

$$\int_a^b f(x)\,w(x)\,dx$$

where the integrand has been factored into a non-negative term w(x)
and a strongly continuous term f(x).

The first application of GQR in the communications literature (Ref. 1) was
motivated by the work of Golub and Welsch (Ref. 2) and Sack and Donovan (Ref. 3),
who showed that the non-negative factor w(x) need not even be
completely known to compute the desired integral. Only a finite
number of the moments of w(x) are required to find the desired
integration rule. Benedetto et al. (Ref. 1) noticed that the problem of error
probability evaluation in the presence of intersymbol interference (ISI)
could be posed in this form. Other applications of the GQR technique
can be found in Refs. 4 through 9.

In this paper, we apply the GQR technique to a larger class of
problems where f(x) need not be continuous. We begin by reviewing
a fundamental result in the theory of GQR.


Theorem: Let w(x) be a non-negative weight function defined on (a,
b). Then if f(x) has continuous derivatives up to order 2N (see Refs.
10 through 13),

$$J = \int_a^b f(x)\,w(x)\,dx = \sum_{i=1}^{N} A_i f(t_i) + R_N(\xi), \quad a < \xi < b, \quad a < t_i < b, \quad i = 1, 2, \cdots, N, \qquad (1)$$

where

$$R_N(\xi) = \frac{f^{(2N)}(\xi)}{(2N)!\,k_N^2}, \quad a < \xi < b, \qquad (2)$$

$f^{(2N)}(x)$ is the 2Nth derivative of f(x) and (2N)! is 2N factorial. The
nodes $\{t_i\}$ are the distinct real roots of the unique Nth degree
polynomial

$$p_N(x) = k_N \prod_{i=1}^{N} (x - t_i), \quad k_N > 0. \qquad (3)$$

The polynomials $p_n(x)$ are orthonormal with respect to w(x), i.e.,

$$\int_a^b w(x)\,p_m(x)\,p_n(x)\,dx = \begin{cases} 1, & m = n \\ 0, & m \ne n. \end{cases}$$

The strictly positive weights (or Christoffel numbers) are in turn
given by

$$A_i = -\frac{k_{N+1}}{k_N}\,\frac{1}{p_{N+1}(t_i)\,p_N'(t_i)}, \quad i = 1, 2, \cdots, N, \qquad (4)$$

where

$$p_N'(t_i) = \frac{dp_N(t)}{dt}\bigg|_{t=t_i}.$$

The 2N-tuple $\{A_i, t_i\}_{i=1}^{N}$ is known as the N-point rule corresponding to
w(x).

If f(x) is a polynomial of degree (2N − 1) or less, the remainder
$R_N(\xi)$ equals zero and the GQR is exact. This affords the maximum
degree of precision (i.e., the maximum degree polynomial that is
integrable with no error for an N-point rule) possible with a quadrature
formula of the form of (1) (Refs. 10, 12). When the remainder is not zero, it can
be bounded in magnitude to obtain upper and lower bounds on J. The
bounds obtained in Ref. 1 for the ISI and Gaussian noise problem are
often loose though, and convergence of the N-term summation in (2)
is usually much faster than might be inferred from bounds on $R_N(\xi)$.


2.2 Methods for computing GQR 


Several algorithms are known for efficiently computing the rule for
an arbitrary weight function w(x). Extremely useful procedures have
been discovered by Golub and Welsch (Ref. 2), Sack and Donovan (Ref. 3), and
Gautschi (Ref. 14). The outstanding merit of these techniques is that the N-
point rule corresponding to a given w(x) can be computed from the
moments

$$\mu_i = \int_a^b x^i\,w(x)\,dx. \qquad (5)$$

Because explicit knowledge of the weight function w(x) is not required,
the GQR procedure is a powerful tool for the analysis of communications
systems.

Details of an algorithm for computing GQR are given in the appendix.
Our algorithm is a modification of Gautschi's procedure (Ref. 14), which tends
to reduce computer roundoff errors. The critical stage in the algorithm
is the Cholesky decomposition of a positive definite matrix of moments.
The standard Cholesky decomposition used in Refs. 1 and 2 fails when,
because of limitations of machine accuracy, the matrix is no longer
positive definite due to roundoff errors. Improved accuracy is obtained
by using an alternate method of performing the Cholesky decomposition*
that avoids taking a square root at each step in the algorithm (Ref. 15).
Combining the alternate Cholesky decomposition with the modified
moment algorithm of Gautschi (Ref. 14) yields an extremely stable method for
obtaining GQR. Further discussion of techniques to mitigate computer
roundoff errors is found in the appendix.
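
To make the moment-based construction concrete, the following is a minimal sketch of the classical approach based on a standard Cholesky factorization of the Hankel matrix of moments, followed by an eigenvalue problem for a tridiagonal matrix. It is not the modified, square-root-free algorithm described in the appendix, and it inherits the roundoff sensitivity discussed above. The function names and the choice of a unit Gaussian as the test distribution are assumptions made for illustration.

```python
import numpy as np

def gqr_from_moments(mu):
    """Return nodes t_i and weights A_i of the N-point rule from moments mu[0..2N]."""
    mu = np.asarray(mu, dtype=float)
    N = (len(mu) - 1) // 2
    # Hankel matrix of moments, H[i, j] = mu[i + j], of size (N+1) x (N+1).
    H = np.array([[mu[i + j] for j in range(N + 1)] for i in range(N + 1)])
    R = np.linalg.cholesky(H).T            # upper triangular factor, H = R^T R
    # Three-term recurrence coefficients of the orthonormal polynomials.
    alpha = np.empty(N)
    for j in range(N):
        alpha[j] = R[j, j + 1] / R[j, j]
        if j > 0:
            alpha[j] -= R[j - 1, j] / R[j - 1, j - 1]
    beta = np.array([R[j + 1, j + 1] / R[j, j] for j in range(N - 1)])
    # Symmetric tridiagonal matrix whose eigenvalues are the nodes.
    J = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    t, V = np.linalg.eigh(J)
    A = mu[0] * V[0, :] ** 2               # weights from first eigenvector components
    return t, A

def gaussian_moments(count):
    """First `count` moments of a zero-mean, unit-variance Gaussian variable."""
    mu = [1.0, 0.0]
    for k in range(2, count):
        mu.append((k - 1) * mu[k - 2])     # mu_k = (k - 1) * mu_{k-2}
    return mu[:count]

N = 5
mu = gaussian_moments(2 * N + 1)
t, A = gqr_from_moments(mu)
reproduced = [float(np.sum(A * t ** j)) for j in range(2 * N)]
print(t, A)
```

The rule reproduces the moments of order 0 through 2N − 1, which is the algebraic statement of the exactness property for polynomials of degree 2N − 1 or less.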


2.3 Computing the distribution of a random variable via GQR 


In Ref. 1 GQRs are used to obtain the exact probability of error for
digital transmission in the presence of ISI and Gaussian noise. The
problem was reduced, via the GQR approach, to computing the moments
of the ISI and letting f(x) in (1) be the probability of error
caused by Gaussian noise conditioned on the ISI. The ISI moments can
be computed via Prabhu's method (Ref. 16) when the data symbols are independent.
For a large class of correlated data, the moments can be
efficiently computed via the modified Cariolaro-Pupolin algorithm (Refs. 17, 18).
Both of the above procedures are easily implemented and have a
complexity that grows only linearly with pulse duration.

While there have been numerous applications of GQR to problems in
the literature, all those known to us have had the restriction that the
function f(x) has continuous derivatives up to order 2N. Presumably,
this is because of the desire for strict bounds on the error term in (2).
If we are willing to forego the analytical error term and consequently
accept an empirical convergence of (1), we can apply the GQR technique
to a larger class of problems with excellent results.

The following theorem shows that no continuity requirements need 
be imposed on f (x). 

Theorem: (see Ref. 19) If W(x) is a fixed, nondecreasing function with
infinitely many points of increase and the Riemann-Stieltjes integral

$$\int_a^b f(x)\,dW(x)$$

exists, then

$$\int_a^b f(x)\,dW(x) = \lim_{N\to\infty} \sum_{i=1}^{N} A_i f(t_i), \qquad (6)$$

where $\{A_i, t_i\}_{i=1}^{N}$ is the GQR corresponding to the moments

$$\mu_i = \int_a^b x^i\,dW(x), \quad i = 0, 1, 2, \cdots, 2N.$$

The function f(x) is arbitrary as long as the integral in (6) exists.

* Applying the alternate Cholesky decomposition to the GQR problem was suggested
by L. Kaufman. Subsequently, the same approach was found to have been independently
proposed in Ref. 4.

Because the CDF of a random variable is a nondecreasing function,
we write the statistical expectation of the function f(x) as

$$E[f(x)] = \int_a^b f(x)\,dW(x), \qquad (7)$$

where W(x) is a probability measure with infinitely many points of
rise. Choosing f(x) to be the indicator function

$$f(x) = \phi_\alpha(x) = \begin{cases} 1, & x \le \alpha \\ 0, & x > \alpha, \end{cases} \qquad (8)$$

we obtain the distribution function of the random variable as

$$\int_a^b \phi_\alpha(x)\,dW(x) = \lim_{N\to\infty} \sum_{i \in S_\alpha^N} A_i = \lim_{N\to\infty} W_N(\alpha), \qquad (9)$$

where

$$W_N(\alpha) = \sum_{i \in S_\alpha^N} A_i \qquad (10)$$

and

$$S_\alpha^N = \{i : t_i \le \alpha\}$$

is the set of indices for which $t_i \le \alpha$.

Since the rule can be obtained from the $\{\mu_i\}_{i=0}^{2N}$, we have a means of
constructing an approximation to the CDF of a random variable from
its moments. In the limit as N approaches infinity, eq. (9) is exact at
each point α.

This leads us to propose the following estimator

$$\hat{W}(x) = W_N(x) = \sum_{i \in S_x^N} A_i. \qquad (11)$$

This estimator gives a staircase approximation to the true cumulative
distribution that becomes increasingly fine as N increases. Equivalently,
each $(A_i, t_i)$ can be considered a point mass of a discrete
approximation to the true probability density function.
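
The staircase estimator amounts to accumulating the weights of all nodes at or below the evaluation point. The sketch below assumes a unit Gaussian random variable and, purely as a stand-in for a rule computed from moments, takes the nodes and weights from NumPy's `hermegauss` routine; the function name `staircase_cdf` is hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def staircase_cdf(nodes, weights, x):
    """Staircase estimator of (11): sum of the weights A_i over nodes t_i <= x."""
    nodes = np.asarray(nodes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.array([weights[nodes <= xi].sum() for xi in np.atleast_1d(x)])

N = 20
t, w = hermegauss(N)      # rule for the weight exp(-x^2 / 2)
A = w / w.sum()           # normalize so that the weights form a probability mass

points = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
estimate = staircase_cdf(t, A, points)
print(estimate)
```

Because N is even here, no node sits at the origin and the estimate at zero is one half by symmetry; between nodes the estimate is constant, which is the jump behavior described above.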
While Szegő's theorem proves the asymptotic convergence of the
estimator to the true CDF when W(x) has an infinite number of points
of rise, a different result holds for a discrete distribution with a finite
number of points of increase.

Theorem: If W(x) is a fixed, nondecreasing function with M < ∞
points of increase, then

$$\int_a^b \phi_\alpha(x)\,dW(x) = \lim_{N\to M} W_N(\alpha). \qquad (12)$$
; NoM 


Proof: An alternate formulation of Gaussian Quadrature” is as the 
purely algebraic solution to 


$$\mu_j = \sum_{i=1}^{N} A_i (t_i)^j, \quad j = 0, 1, \cdots, 2M. \qquad (13)$$

Now we assume the unknown discrete PDF is of the form

$$w(x) = \sum_{i=1}^{M} A_i\,\delta(x - t_i).$$

The moments of this random variable are given by

$$\sum_{i=1}^{M} A_i (t_i)^j,$$

which is identical to (13) for N = M.

Finally, we consider the behavior of the GQR for N larger than the
number of points of increase M. The result is that the algorithm breaks
down entirely. This is because a discrete distribution that takes on
exactly M values is completely characterized by its first 2M moments
and the addition of redundant moments to the problem causes the
procedure to fail when the Hankel matrix of moments [eq. (21)]
becomes singular.


2.4 A modified GQR estimator 


The following is a modification of the estimator $W_N(\alpha)$ that has
been found to be more accurate in many applications. Instead of assuming
that the approximation PDF is composed of point masses, we assume
that each area of mass $A_i$ is more accurately modeled by a narrow,
even symmetric, distribution centered around the point $t_i$. Thus, we
propose the smoothed estimator $W_N^*(\alpha)$ which, when evaluated at a
node, equals

$$W_N^*(t_i) = W_N(t_i) - \frac{A_i}{2}. \qquad (14)$$

Between nodes, $W_N^*(\alpha)$ is given by any “smooth” interpolation
routine. A simple linear interpolation was found to be sufficient in the
examples that follow. This estimator does not have the jump
discontinuities of the estimator $W_N(\alpha)$ and is intuitively more
satisfying because it fits a smoother distribution to W(x).
$W_N^*(\alpha)$ has been found to give more accurate results when applied
to known continuous distributions and to discrete distributions when M
is much greater than N.

Fig. 1—Convergence of GQR estimator for Gaussian PDF.
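The two estimators can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the nodes are assumed distinct, and the smoothed estimate is simply held constant outside the outermost nodes, a tail treatment the text does not specify.

```python
def staircase_cdf(nodes, weights, x):
    """Staircase estimate, eq. (11): sum of the weights A_i with t_i <= x."""
    return sum(a for t, a in zip(nodes, weights) if t <= x)


def smoothed_cdf(nodes, weights, x):
    """Smoothed estimate: W_N(t_i) - A_i/2 at each node (eq. (14)),
    linear interpolation between adjacent nodes."""
    pairs = sorted(zip(nodes, weights))
    ts, knots, running = [], [], 0.0
    for t, a in pairs:
        running += a
        ts.append(t)
        knots.append(running - a / 2.0)   # value of eq. (14) at node t
    # Assumption: hold the end values outside the node range.
    if x <= ts[0]:
        return knots[0]
    if x >= ts[-1]:
        return knots[-1]
    for k in range(len(ts) - 1):
        if ts[k] <= x <= ts[k + 1]:
            frac = (x - ts[k]) / (ts[k + 1] - ts[k])
            return knots[k] + frac * (knots[k + 1] - knots[k])
```

For two equal point masses at -1 and +1, the staircase jumps from 0 to 0.5 to 1, while the smoothed version passes through 0.25 and 0.75 at the nodes.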


III. APPLICATION TO ARBITRARY DISTRIBUTIONS
3.1 Known continuous random variable case 


To show the convergence properties of the GQR technique, we
illustrate the behavior of $W_N(\alpha)$ and $W_N^*(\alpha)$ with some examples. We
begin with the Gaussian distribution. Assuming a zero mean, unit 
variance random variable X, we compute the GQR estimators for 
various values of N in Fig. 1. Reasonably accurate results were obtained 
at the 107° point for N > 10. This empirical rate of convergence is also 
typical of distributions that have near-Gaussian statistics. The GQR 
algorithm, using the Cholesky decomposition described in the appen- 
dix, returned accurate results for all N <= 60, where N = 60 was the 
dimensionality limit in the computer program. 

In general, the GQR algorithm performs well for zero mean, symmetric
distributions. To illustrate the problems that can occur with
nonsymmetric distributions, consider the lognormal distribution related
to the Gaussian distribution by $Y = e^X$. Straightforward computation of
the moments yields

$$\mu_k = \exp(k^2/2). \qquad (15)$$

Using these moments in the GQR algorithm, the algorithm breaks down
at N = 17 because of roundoff errors in computation. This problem is
solved by a transformation that symmetrizes the distribution. We then
compute the GQR corresponding to the symmetrized distribution and
take the inverse transform to obtain the original distribution.

For the lognormal distribution, we form a new PDF

$$w_e(y) = \tfrac{1}{2}[w(y) + w(-y)], \qquad (16)$$

which corresponds to the even part of w(y). The moments of $w_e(y)$
are obtained by setting the odd moments of w(y) equal to zero. The
symmetric moments are then used in the GQR algorithm to obtain
$W_e(y)$, which is easily transformed back to the desired CDF via the
relation

$$W(y) = \begin{cases} 2W_e(y) - 1, & y \ge 0 \\ 0, & y < 0. \end{cases} \qquad (17)$$

Experience with these procedures suggests that it is well worth the 
effort to transform distributions that are not symmetric and even (see 
Fig. 2). The modified moments produce the same robust accuracy seen 
with the Gaussian distribution above. 
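The symmetrizing transformation amounts to very little computation, as the following sketch suggests. The function names are hypothetical; the moments are those of eq. (15), the odd moments are zeroed as described for eq. (16), and the last function applies the inverse relation (17) to a value of the symmetrized CDF that is assumed to have been obtained elsewhere.

```python
import math


def lognormal_moment(k):
    """k-th moment of Y = exp(X), X zero mean and unit variance, eq. (15)."""
    return math.exp(k * k / 2.0)


def symmetrized_moments(moments):
    """Moments of the even part w_e(y): keep the even-order moments of
    w(y) and set the odd-order ones to zero."""
    return [m if k % 2 == 0 else 0.0 for k, m in enumerate(moments)]


def cdf_from_symmetrized(w_e_value, y):
    """Inverse transform, eq. (17): recover W(y) from W_e(y) for a
    one-sided (nonnegative) random variable."""
    return 2.0 * w_e_value - 1.0 if y >= 0 else 0.0


# Moments mu_0 .. mu_4 of the lognormal, then their symmetrized version.
mu = [lognormal_moment(k) for k in range(5)]
mu_sym = symmetrized_moments(mu)
```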

As another example, consider a uniform distribution defined on the
interval (−1, 1). The convergence of the GQR estimator $W_N(\alpha)$ is
shown in Fig. 3. Since the distribution has only finite support, by eq. (1)
we know that all the nodes will lie in the interval (−1, 1). In the limit as
N → ∞, the nodes will become more densely packed in this interval
and

$$\lim_{N \to \infty} \{\max_i |t_i|\} = 1. \qquad (18)$$

Thus, the GQR algorithm can be used to find the maximum value that
a random variable attains, i.e., the largest node $t_{\max}$. This can be used,
for example, to find the maximum eye degradation in a digital
regenerator caused by correlated intersymbol interference.

All the examples so far have been trivial applications since we knew 
the real distributions a priori. A more interesting application is deter- 
mining the distribution of the sum of K lognormal random variables. 
This problem has a long history and no closed form solution is known. 
This PDF is related to the distribution of crosstalk power in paired 
cable transmission systems and also results from transmission over 
certain types of fading channels. Utilizing the GQR technique, we can
find the desired distribution if we can compute the necessary moments.

Fig. 2—Convergence of GQR estimator for lognormal PDF.

Fig. 3—Convergence of GQR estimator for uniform PDF.

Fig. 4—Sum of K lognormal (K = 1, 2, 4, 8).


Assuming that the lognormal random variables are independent, and
following Prabhu,’ we find that the moments of

$$V_K = \sum_{i=1}^{K} Y_i$$

are given by the recurrence relation

$$E[(V_K)^j] = \sum_{i=0}^{j} \binom{j}{i} E[(V_{K-1})^i]\, \mu_{j-i}, \qquad (19)$$

where $\{\mu_i\}_{i=0}^{j}$ are the moments of the independent, identically
distributed lognormal random variables. Figure 4 shows the resulting
distributions for K = 2, 4, 8, and 16. As we mentioned above, the distribution
was symmetrized and inverse transformed to reduce the effects of 
roundoff errors. This technique can be applied to any number of 
arbitrary distributions for which the required moments can be com- 
puted. 
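The recurrence (19) is easy to exercise numerically. The sketch below is an illustration rather than the program used for Fig. 4; the function name is hypothetical, and the individual moments are taken from eq. (15) for a zero mean, unit variance underlying Gaussian.

```python
import math


def sum_moments(mu, K):
    """Moments E[(V_K)^j], j = 0 .. len(mu)-1, of V_K = Y_1 + ... + Y_K,
    where the Y_i are i.i.d. with moments mu[j], using recurrence (19)."""
    prev = list(mu)                      # V_1 = Y_1
    for _ in range(K - 1):
        prev = [
            sum(math.comb(j, i) * prev[i] * mu[j - i] for i in range(j + 1))
            for j in range(len(mu))
        ]
    return prev


# Lognormal moments from eq. (15), then the moments of a sum of two.
mu = [math.exp(k * k / 2.0) for k in range(4)]
v2 = sum_moments(mu, 2)
```

As a check, the second moment of the sum of two independent variables should equal 2·mu[2] + 2·mu[1]², which is what the recurrence produces.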


3.2 Known discrete random variable case 


In this section, we apply the GQR estimator $W_N^*(\alpha)$ to discrete
distributions. First we consider the case of a mixed distribution
composed of a Gaussian distribution plus discrete components. The weights
and nodes of the estimator are shown in Fig. 5, where the second
moment of the discrete part equals ten times that of the Gaussian
component. As we can readily discern, the GQR procedure is useful in
identifying the discrete components of a PDF.

[Figure 5: weights (vertical axis, 0 to 0.16) versus nodes (horizontal
axis, −1.8 to 1.8) for X = X1 + X2, with X1 = η(0, 0.1) and X2 = ±1,
each with P = 0.5.]

Fig. 5—Mixed Gaussian distribution.

2254 THE BELL SYSTEM TECHNICAL JOURNAL, NOVEMBER 1982

As the final example of a known distribution, we consider the sum
of nine equally spaced delta functions

$$w(x) = \frac{1}{9} \sum_{i=1}^{9} \delta(x - x_i), \qquad x_i = -5 + i, \qquad i = 1, 2, \ldots, 9. \qquad (20)$$

The convergence of the GQR estimator $W_N^*(x)$ is shown in Fig. 6, where
the N = 9 estimator is exact since the distribution is uniquely defined
by the first 2N = 18 moments. For N > 9, the algorithm breaks down.


IV. SUMMARY 


An estimator based on GQR has been proposed, which converges
rapidly to the CDF of a random variable and requires only knowledge
of the moments of the random variable in question. The technique is 
generally applicable to a large class of communications problems and 
provides a practical solution to many analytically intractable problems. 
The technique works equally well for discrete and continuous distri- 
butions and assumes no a priori knowledge of the distribution. 
Fig. 6—GQR for discrete distribution.

APPENDIX
Details of the GQR Algorithm 


In this appendix, we outline the algorithm used to compute Gaussian
Quadrature Rules. The procedure combines Gautschi’s modified moment
technique” with the Cholesky decomposition suggested by Martin
et al.’’ The resulting algorithm has been implemented using double
precision arithmetic and has proven stable and robust.

To compute the 2N unknowns $\{A_i\}_{i=1}^{N}$ and $\{t_i\}_{i=1}^{N}$, we first
form the matrix of modified moments

$$M = \begin{bmatrix} m_{1,1} & m_{1,2} & \cdots & m_{1,N+1} \\ m_{2,1} & \ddots & & \vdots \\ \vdots & & \ddots & \vdots \\ m_{N+1,1} & \cdots & \cdots & m_{N+1,N+1} \end{bmatrix}, \qquad (21)$$

where $m_{ij}$ is given by the inner product

$$m_{ij} = (T_{i-1}, T_{j-1}) = \int_a^b T_{i-1}(x)\, T_{j-1}(x)\, dW(x), \qquad i, j = 1, 2, \ldots, N+1, \qquad (22)$$

and $\{T_j\}_{j=0}^{N}$ are the first N + 1 members of an arbitrary set of
polynomials satisfying the recurrence relation

$$x T_j(x) = a_j T_{j+1}(x) + b_j T_j(x) + c_j T_{j-1}(x), \qquad j = 0, 1, 2, \ldots, N,$$
$$T_{-1}(x) = 0, \qquad a_j \neq 0. \qquad (23)$$

The orthogonal Tchebycheff polynomials determined by (23) constitute
a convenient choice, with

$$a_0 = 1,$$
$$a_j = c_j = \tfrac{1}{2}, \qquad j = 1, 2, \ldots,$$
$$b_j = 0, \qquad j = 0, 1, 2, \ldots. \qquad (24)$$


The modified moments $m_{ij}$ in (22) are simply linear combinations of
the moments

$$\mu_j = \int_a^b x^j\, dW(x)$$

and can be simplified for the case of the Tchebycheff polynomials by
using the relation

$$T_i(x)\, T_j(x) = \tfrac{1}{2}\{T_{i+j}(x) + T_{i-j}(x)\}, \qquad i \ge j. \qquad (25)$$

Thus, if we define

$$\nu_k = \int_a^b T_k(x)\, dW(x),$$

then

$$m_{ij} = \tfrac{1}{2}\{\nu_{i+j-2} + \nu_{i-j}\}, \qquad i \ge j. \qquad (26)$$
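As an illustration of how the modified moments follow from the ordinary ones, the sketch below expands each Tchebycheff polynomial in powers of x, forms the quantities ν_k as linear combinations of the μ_j, and then fills the matrix through eq. (26). The function names are hypothetical, indices are zero-based (so entry [i][j] corresponds to m_{i+1,j+1}), and the expansion in powers of x is used only for clarity; it is not claimed to be the numerically preferred route.

```python
def chebyshev_coeffs(kmax):
    """Power-series coefficients (lowest power first) of T_0 .. T_kmax,
    built from T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x)."""
    polys = [[1.0], [0.0, 1.0]]
    for k in range(1, kmax):
        two_x_tk = [0.0] + [2.0 * c for c in polys[k]]
        prev = polys[k - 1] + [0.0] * (len(two_x_tk) - len(polys[k - 1]))
        polys.append([a - b for a, b in zip(two_x_tk, prev)])
    return polys[:kmax + 1]


def modified_moment_matrix(mu, n):
    """(n+1) x (n+1) matrix of eq. (21) from ordinary moments mu[0..2n],
    using m_ij = (nu_{i+j-2} + nu_{|i-j|}) / 2 from eq. (26)."""
    polys = chebyshev_coeffs(2 * n)
    nu = [sum(c * mu[p] for p, c in enumerate(poly)) for poly in polys]
    return [[0.5 * (nu[i + j] + nu[abs(i - j)]) for j in range(n + 1)]
            for i in range(n + 1)]


# Uniform density on (-1, 1): mu_k = 1/(k+1) for even k, 0 for odd k.
mu_uniform = [1.0, 0.0, 1.0 / 3.0, 0.0, 1.0 / 5.0]
m = modified_moment_matrix(mu_uniform, 2)
```

For this uniform example the diagonal entries are the integrals of T_0², T_1², and T_2² against the density, namely 1, 1/3, and 7/15.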


It is not necessary in theory for the $T_j(x)$ to be orthogonal. The
formulation by Golub and Welsch? used the unmodified moments
corresponding to $T_j(x) = x^j$ and, hence, $a_j = 1$, $b_j = 0$, and $c_j = 0$ for all
j. As Gautschi shows,’* the use of modified moments results in less
sensitivity to computer roundoff errors.

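We next form the tridiagonal matrix

$$J = \begin{bmatrix} \alpha_1 & \beta_1 & & & 0 \\ \beta_1 & \alpha_2 & \beta_2 & & \\ & \beta_2 & \ddots & \ddots & \\ & & \ddots & \alpha_{N-1} & \beta_{N-1} \\ 0 & & & \beta_{N-1} & \alpha_N \end{bmatrix}, \qquad (27)$$

where

$$\alpha_j = b_{j-1} + a_{j-1} \frac{r_{j,j+1}}{r_{j,j}} - a_{j-2} \frac{r_{j-1,j}}{r_{j-1,j-1}}, \qquad j = 1, 2, \ldots, N,$$

$$\beta_j = a_{j-1} \frac{r_{j+1,j+1}}{r_{j,j}}, \qquad j = 1, 2, \ldots, N-1,$$

with the second quotient in $\alpha_j$ omitted for j = 1.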
The $r_{ij}$ are found from the relation

$$M = R^T R. \qquad (28)$$


The matrix R is an upper triangular matrix and theoretically is 
positive definite if M is positive definite. In practice, however, M can 
be ill-conditioned and finite precision arithmetic will cause the matrix 
to appear singular. 

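The elements of R are related to the moment matrix M by the
relations

$$r_{ii} = \Bigl(m_{ii} - \sum_{k=1}^{i-1} r_{ki}^2\Bigr)^{1/2},$$

$$r_{ij} = \Bigl(m_{ij} - \sum_{k=1}^{i-1} r_{ki}\, r_{kj}\Bigr) \Big/ r_{ii}, \qquad i < j,$$

$$i, j = 1, 2, \ldots, N. \qquad (29)$$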


In practice, the computation of R from (29) will fail at relatively small 
values of N when the square root of a negative number is attempted. 

A refined Cholesky decomposition” overcomes this problem by only
requiring square roots to be computed at the end of the decomposition
and not at each step as in (29). If we define R* by the relation

$$R = \mathrm{diag}(r_{ii})\, R^*,$$

then R* will be a unit upper triangular matrix and

$$M = R^T R = R^{*T}\, \mathrm{diag}(r_{ii}^2)\, R^* = R^{*T} D R^*, \qquad (30)$$

where D is a positive diagonal matrix. Then, defining the auxiliary
quantities

$$\tilde{m}_{ij} = r^*_{ji}\, d_j, \qquad (31)$$

the following solution is obtained

$$\tilde{m}_{ij} = m_{ij} - \sum_{k=1}^{j-1} \tilde{m}_{ik}\, r^*_{kj}, \qquad j = 1, 2, \ldots, i-1,$$

$$d_i = m_{ii} - \sum_{k=1}^{i-1} \tilde{m}_{ik}\, r^*_{ki}. \qquad (32)$$


The advantage of the alternate decomposition is that square roots
are not required until the final step, when the positive diagonal matrix
D in (30) is factored. Along with the modified moment procedure, the
alternate Cholesky decomposition yields accurate results even for large
values of N.

Table I—Comparison of three implementations

                                          Standard   Alternate   Alternate Cholesky,
                                          Cholesky   Cholesky    Modified Moments
Gaussian Random Variable                     17         60              60
Uniform Random Variable                      13         40              38
Lognormal Random Variable                    —          17              14
Symmetrized Lognormal Random Variable        —          60              60
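A square-root-free factorization of this kind can be sketched as follows. This is a minimal illustration of the idea in (30), not the program evaluated in Table I: the function names are hypothetical, the auxiliary products of (31) are formed inline rather than stored, and no pivoting or conditioning safeguards are included.

```python
import math


def ldlt_upper(m):
    """Factor a symmetric matrix m as R*^T D R*, returning the unit
    upper-triangular R* (list of rows) and the diagonal of D."""
    n = len(m)
    r = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for i in range(n):
        # Diagonal term: no square root is taken here.
        d[i] = m[i][i] - sum(d[k] * r[k][i] ** 2 for k in range(i))
        for j in range(i + 1, n):
            s = sum(d[k] * r[k][i] * r[k][j] for k in range(i))
            r[i][j] = (m[i][j] - s) / d[i]
    return r, d


def cholesky_from_ldlt(r, d):
    """Recover R = diag(sqrt(d_i)) R*; the only step needing square roots."""
    n = len(d)
    return [[math.sqrt(d[i]) * r[i][j] for j in range(n)] for i in range(n)]


r_star, d = ldlt_upper([[4.0, 2.0], [2.0, 5.0]])
big_r = cholesky_from_ldlt(r_star, d)
```

Because the diagonal entries d_i are only examined at the end, a slightly negative value caused by roundoff can be detected in one place instead of aborting the factorization at an intermediate step.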

Several implementations of the GQR algorithm have been examined
to elucidate the features that contribute to the reduction of computer
roundoff errors. These include:

(i) Standard Cholesky
(ii) Alternate Cholesky
(iii) Alternate Cholesky with modified moments
(iv) All of the above using symmetrized moments.

Each approach was evaluated in double precision arithmetic.

The value of N at which the Cholesky decomposition fails was 
chosen as the measure of robustness for a variety of input probability 
density functions. Some of these results are tabulated in Table I. The 
standard Cholesky consistently had the poorest performance for all of 
the distributions considered. For symmetric distributions, the alternate 
Cholesky scheme provided a significant reduction of computer error. 
For the Gaussian distribution, the procedure was accurate for all
N ≤ 60, where 60 was the dimensionality limit imposed on the
computer routine by storage requirements. The addition of the modified
moment approach resulted in virtually no improvement relative to the
alternate Cholesky implementation alone. None of the first three
approaches proved satisfactory for nonsymmetric distributions (e.g.,
lognormal). The solution to this obstacle for one-sided distributions is
to symmetrize the distribution according to (16), find the GQR estimate
for the symmetrized distribution, and then obtain the desired distri- 
bution using (17). As we see in Table I, this renders the lognormal 
estimate as robust as the symmetric Gaussian distribution. 

The final step in obtaining the nodes and weights involves finding
the eigenvalues and eigenvectors of the matrix J in (27). The
eigenvector $q_j$ corresponding to the eigenvalue $t_j$ is found from the equation

$$J q_j = t_j q_j, \qquad j = 1, 2, \ldots, N. \qquad (33)$$

The eigenvalues $\{t_j\}_{j=1}^{N}$ are the nodes of the GQR and the positive
weights are given by

$$A_j = q_{1,j}^2\, \mu_0, \qquad (34)$$

where

$$q_j^T = (q_{1,j}, q_{2,j}, \ldots, q_{N,j}).$$

A flowchart of the steps used to compute GQR is shown in Fig. 7.

Fig. 7—Flowchart of GQR algorithm.
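The overall flow can be sketched end to end in plain Python. The sketch makes several simplifying assumptions that differ from the implementation described above: it uses ordinary moments (the Golub and Welsch case, $a_j = 1$, $b_j = c_j = 0$) and the standard Cholesky recursion (29), so it is only suitable for small N; it finds the eigenvalues of J by bisection on a Sturm sequence instead of a library eigensolver; and it obtains each weight from the three-term recurrence, which gives the squared first component of the unit-length eigenvector used in (34). All function names are hypothetical.

```python
import math


def cholesky_upper(m):
    """Upper-triangular R with m = R^T R, eq. (29). math.sqrt raises
    ValueError if roundoff makes a pivot negative."""
    n = len(m)
    r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = m[i][j] - sum(r[k][i] * r[k][j] for k in range(i))
            r[i][j] = math.sqrt(s) if i == j else s / r[i][i]
    return r


def jacobi_from_moments(mu, n):
    """Diagonal (alpha) and off-diagonal (beta) of J from mu[0..2n]."""
    hankel = [[mu[i + j] for j in range(n + 1)] for i in range(n + 1)]
    r = cholesky_upper(hankel)
    alpha, beta = [], []
    for j in range(n):
        a = r[j][j + 1] / r[j][j]
        if j > 0:
            a -= r[j - 1][j] / r[j - 1][j - 1]
        alpha.append(a)
        if j < n - 1:
            beta.append(r[j + 1][j + 1] / r[j][j])
    return alpha, beta


def count_below(alpha, beta, x):
    """Sturm count: number of eigenvalues of J smaller than x."""
    count, d = 0, 1.0
    for k in range(len(alpha)):
        d = alpha[k] - x - (beta[k - 1] ** 2 / d if k > 0 else 0.0)
        if d == 0.0:
            d = 1e-150          # avoid dividing by an exact zero
        if d < 0.0:
            count += 1
    return count


def gqr(mu, n):
    """Nodes t_j (ascending) and weights A_j of the n-point rule."""
    alpha, beta = jacobi_from_moments(mu, n)
    radius = max(abs(a) for a in alpha) + 2.0 * max([abs(b) for b in beta] + [0.0])
    nodes, weights = [], []
    for j in range(n):
        lo, hi = -radius - 1.0, radius + 1.0
        for _ in range(200):                       # bisection for j-th eigenvalue
            mid = 0.5 * (lo + hi)
            if count_below(alpha, beta, mid) > j:
                hi = mid
            else:
                lo = mid
        t = 0.5 * (lo + hi)
        # Eigenvector components via the three-term recurrence, first one = 1.
        p_prev, p, norm = 0.0, 1.0, 1.0
        for k in range(n - 1):
            p_next = ((t - alpha[k]) * p
                      - (beta[k - 1] * p_prev if k > 0 else 0.0)) / beta[k]
            norm += p_next ** 2
            p_prev, p = p, p_next
        nodes.append(t)
        weights.append(mu[0] / norm)               # eq. (34)
    return nodes, weights
```

For the uniform density on (−1, 1), with moments 1, 0, 1/3, 0, 1/5, the two-point rule returned by `gqr` has nodes at ±1/√3 and equal weights of 1/2, which is the familiar two-point Gauss-Legendre rule scaled to unit total mass.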
A 9.6-kb/s DSP Speech Coder

A digital speech coder has been designed for real-time operation
at a data rate of 9.6 kb/s. The design is based on a combination of
two speech compression techniques: Time-Domain Harmonic Scaling
(TDHS) and Sub-Band Coding (SBC). It is a highly modularized
hardware implementation using five Bell Laboratories Digital Signal
Processor (DSP) integrated circuits as the key processing elements.
Three DSPs are used in the encoder for pitch detection, TDHS
compression, and sub-band encoding. Another two DSPs are used in the
receiver for sub-band decoding and TDHS expansion. In this paper we
describe the overall design of the system and discuss some of the
techniques used to realize it in the DSP hardware in real time. General
issues of the algorithm design, software implementation, and
hardware design are considered.


I. INTRODUCTION


The subject of digital speech encoding and bit-rate compression has 
been one of considerable interest in recent years. Attention has focused 
strongly on bit rates in the range of 9.6 kb/s for applications where 
good “communications quality” is required and where robustness 
across a broad range of background noise conditions and speaker 
variations is necessary.’ This bit rate appears, at present, to be about 
the lowest practical rate at which this standard of quality and robust- 
ness can be reliably achieved. Below 9.6 kb/s presently known tech- 
niques have a noticeable synthetic quality and are considerably more 
fragile to differences in speakers and background conditions.’ 

Several encoding methods for achieving “communications quality” 
at 9.6 kb/s have been proposed and studied. Most of these methods 
involve a considerable amount of signal processing to meet these goals 
and are thus referred to as “high-complexity” algorithms. Their
implementation in real time typically requires the use of specially designed
high-speed digital hardware or array-processing digital computers.”

In recent studies,” we found that a combined technique of sub-band
coding (SBC) and time-domain harmonic scaling (TDHS) leads to an
encoding approach whose quality is comparable to, or better than,
some of the previously studied “high-complexity” algorithms. The
interesting aspect of this approach is that it is a combination of two
relatively “low-complexity” algorithms that are amenable to real-time
implementation using currently available technology. Thus, this
TDHS/SBC approach appears to be an attractive, economical candidate for
real-time implementation of 9.6-kb/s speech encoding using a
relatively small amount of hardware.

In this paper we discuss the design of a 9.6-kb/s speech coder based
on the above TDHS/SBC approach. The design is highly modularized
around the use of the Bell Laboratories Digital Signal Processor (DSP)
integrated circuit.°° Three DSPs are used in the encoder for pitch
detection, time-domain harmonic-scaling compression, and sub-band
encoding, respectively. Another two DSPs are used in the receiver for
synchronization, sub-band decoding, and time-domain harmonic-scaling
expansion. Essentially all of the signal processing in the coder is
performed by the DSPs with only a minimal amount of support
hardware for interfacing, clock generation, and input/output (I/O)
buffering.

This paper discusses aspects of the overall design and the design
parameters of the coder. Other papers’” discuss in more detail the
software implementation of the pitch, TDHS, and SBC algorithms on
the DSPs and the architecture of the multiple-DSP hardware used to
implement the coder.


II. THE TDHS/SBC ALGORITHM


The TDHS/SBC system is basically a cascade of two different
speech-compression algorithms, TDHS and SBC. Figure 1 gives a basic block
diagram of this approach. The sampled input signal s(n) is first
compressed in bandwidth and sampling rate by the TDHS algorithm to
form the intermediate signal $s_c(n)$. This processing is performed in a
pitch-synchronous manner. Consequently, a pitch detector is required
in the system. The SBC encoder digitally encodes the compressed
signal, $s_c(n)$, to form the encoded data. This data, multiplexed with
the encoded pitch and appropriate frame-synchronization information,
forms a 9.6-kb/s bit-stream which is sent over the digital channel. In
the receiver the digital signal is first synchronized into frames and
demultiplexed into pitch and SBC data. The SBC data is decoded by the
SBC decoder to form the intermediate signal, $\hat{s}_c(n)$, which is a
quantized version of $s_c(n)$. It is then expanded back to its original
bandwidth and sampling rate by a TDHS expansion algorithm to produce
the signal $\hat{s}(n)$, which is a decoded replica of the input s(n).

Fig. 1—Basic block diagram of the TDHS/SBC system.

The two algorithms operate on different properties of redundancy 
of the speech signal to achieve their compression. The TDHS algorithm 
takes advantage of the pitch structure or pseudoperiodic nature of 
speech through a pitch synchronous process.’° It effectively interpo- 
lates two pitch periods of signal into one period to achieve a signal- 
compression factor of two. Further details of this approach are re- 
viewed in Section V. 

The SBC algorithm is a waveform coding technique that achieves a
bit-rate compression by adaptively quantizing the speech in frequency 
bands. It takes advantage of the properties of temporal nonstationarity, 
spectral-formant structure, and auditory masking in speech production 
and perception.!*"' Further details of this method are outlined in 
Section VI. 

Because the TDHS and SBC algorithms operate on different properties 
of redundancy of speech, they are highly complementary and 
“noncompetitive” in their operation. Thus, the overall compression of 
the cascaded system is effectively the product of the two individual 
compression factors. 

A second advantage of this cascaded approach is that the
degradations introduced by the two compression techniques are perceptually
different. Degradations introduced by TDHS compression appear as a
form of reverberance, whereas degradations introduced by SBC coding
appear in the form of quantizing noise and intermodulation distortion.
Since these degradations combine in different perceptual
“dimensions,” the overall perceived degradation tends to be less
objectionable than if they were combined along the same perceptual
“dimension.”

[Figure 2: block diagram with blocks labeled address bus, address
registers, address modification, program memory, data memory, external
data bus, data bus, clock, control, serial output, serial input, select
and timing, reset, adder, and accumulator.]

Fig. 2—Block diagram of the DSP.

Finally, a third desirable feature of this algorithm is that the
compression introduced by the TDHS algorithm allows the SBC algorithm
to be computed at effectively one-half the computation rate of that 
which would be required with an uncompressed signal. Thus, it leads 
to a system that is efficient computationally, as well as one that can be 
modularized into a system of smaller algorithmic units. 


III. BASIC HARDWARE CONFIGURATIONS


The hardware design is highly modularized using the DSP as the key
processing element in each module. Therefore, it is useful to review
the basic characteristics of this device. Figure 2 shows a basic block
diagram of the DSP.” The main elements are: (i) a 1024-word × 16-bit
read-only memory (ROM) for instruction and coefficient storage, (ii) a
128-word × 20-bit random-access memory (RAM) for variable data
storage, (iii) an address arithmetic unit (AAU) with address registers
for controlling memory access, (iv) a data arithmetic unit (AU) with
provision for multiplication, full product accumulation, rounding, and
overflow protection, (v) an I/O unit to control serial data transmission
in and out of the circuit, and (vi) a control unit that provides
instruction decoding and processor synchronization.

Fig. 3—Block diagram of the hardware configuration for the 9.6-kb/s coder.

The processor operates with an 800-ns machine cycle time, which is
established by a 5-MHz clock. In one machine cycle it can: (i) decode
an instruction, (ii) fetch data and perform a 16 × 20-bit multiplication,
(iii) accumulate the output products from the multiplier, and (iv) store
data in memory.
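The timing figures above imply a fixed budget of machine cycles per speech sample, which a short calculation makes concrete. The sketch assumes the 8-KHz input sampling rate used elsewhere in this paper; the variable names are illustrative only.

```python
CLOCK_HZ = 5_000_000        # 5-MHz clock
CYCLE_NS = 800              # machine cycle time
SAMPLE_RATE_HZ = 8_000      # assumed input sampling rate

# Clock periods in one machine cycle: 800 ns at 200 ns per period.
clock_periods_per_cycle = CYCLE_NS * CLOCK_HZ // 1_000_000_000

# Machine cycles available between consecutive input samples (125 us apart).
cycles_per_sample = (1e9 / SAMPLE_RATE_HZ) / CYCLE_NS
```

Under these assumptions each DSP has roughly 156 machine cycles per input sample, which gives a sense of why the computation is spread across several processors.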

Figure 3 is a simplified block diagram of the hardware architecture
for the 9.6-kb/s coder. It consists of cascaded connections of DSP
modules with data passing from one DSP to the next in multiplexed
form. The encoder contains three DSPs in series with an
analog-to-digital (A/D) converter at the input and a first-in-first-out (FIFO)
buffer at the output to the channel. The decoder contains two DSPs with a
FIFO buffer between the second DSP and the output to the
digital-to-analog (D/A) converter. In addition, a logic circuit is necessary at the
receiver between the input of the 9.6-kb/s channel and the first DSP
for use in bit slipping for synchronization on startup.

The analog input, s(t), is first converted to digital form s(n) by a
μ-law A/D converter.” An 8-KHz clock signal for the sampling rate of
the A/D converter is generated from the 9.6-KHz channel clock using
a phase-locked loop (PLL) circuit. The first DSP is used to implement
the pitch detector. It passes the μ-law signal s(n) through to the second
DSP along with multiplexed pitch information. The second DSP is used
for the TDHS compression algorithm. Its output consists of the
compressed signal, $s_c(n)$, multiplexed with pitch and synchronization
information. The third DSP is used for the SBC encoder and for multiplexing
the pitch, synchronization, and SBC-encoded data into a 9.6-kb/s serial
bit stream. Its output is passed to the channel in the form of 16-bit
serial words. A FIFO memory is used to control the flow of data between
the DSP and the channel.

In the decoder the fourth DSP receives the serial bit stream directly
from the 9.6-kb/s channel in the form of 16-bit words. A bit-slipping
logic circuit is used to interface the channel to the DSP; it is controlled
by a flag from the DSP. The circuit is used to align the first bit of a
frame of data to the first bit of a 16-bit word in the synchronization
locking mode of the decoder. Once frame synchronization is
established, the fourth DSP is used to demultiplex the pitch and SBC data
and perform the computation for the SBC decoder. The SBC decoded
output, $\hat{s}_c(n)$, and the pitch are then passed to the fifth DSP, which
computes the TDHS expansion algorithm. The output, $\hat{s}(n)$, of this DSP
is the decoded version of s(n) and it is passed to the D/A (in μ-law PCM
format) through a second FIFO buffer.

In the next five sections we discuss each of the components of this 
system in more detail. Section IV discusses the design of the pitch 
detector, Section V discusses the TDHS algorithms, and Section VI 
discusses the design of the SBC. The overall framing and multiplexing
structure for the coder is discussed in Section VII, and Section VIII
discusses the synchronization detection algorithm used for frame
alignment in the receiver. Finally, Section IX discusses the performance of
the system. 


IV. THE PITCH DETECTOR 


The pitch detector is based on a modification of the
autocorrelation-type pitch-detection algorithm,” and it is designed for implementation
in a single DSP.’ Figure 4 illustrates this basic approach. The input
speech signal s(n) is first converted to a linear PCM signal and lowpass
filtered to remove spectral energy above 1 KHz. The resulting low-pass
filtered signal, x(n), is then used to compute the
autocorrelation-function estimate at time n

$$r_n(m) = \sum_{\ell=-\infty}^{\infty} f(n - \ell)\, x(\ell)\, x(\ell - m), \qquad (1)$$

where m denotes the autocorrelation lag and f(n) corresponds to the
analysis window over which $r_n(m)$ is computed. The pitch period is
then defined as the lag $m_0$ over the range of the allowed set of lag
values {m} for which $\hat{r}_n(m)$ is maximum, i.e.,

$$\hat{r}_n(m_0) = \max_{\{m\}} [r_n(m) \cdot g(m)] \qquad (2a)$$

$$\text{pitch} = p_n = m_0, \qquad (2b)$$

where g(m) is a weighting factor used to control the behavior of the
algorithm.’

Fig. 4—Block diagram of the pitch-detector implementation.

A total of 64 autocorrelation coefficients are included in the set
$\{r_n(m)\}$ corresponding to values

$$\{m\} = \{25, 26, 27, \ldots, 55, 56, 58, 60, 62, \ldots, 116, 118, 120\}. \qquad (3)$$

This allows the pitch period to be encoded into a 6-bit word for
transmission over the channel. With a sampling rate of 8 KHz, the
pitch values in the set {m} correspond to a pitch frequency range from
66.7 Hz to 320 Hz, which spans the range of most speakers. Note also
from eq. (3) that the allowed values of pitch period are more closely
spaced (quantized) for low values, m, than for high values. This allows
for a more uniform percentage of accuracy of pitch for high- and
low-pitch speakers.
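The lag set of eq. (3) and the figures quoted for it can be checked directly. In the sketch below the 6-bit code is simply the position of the lag in the ordered set; that assignment is an assumption made for illustration, since the actual code mapping is not given here.

```python
FS = 8000  # sampling rate in Hz


def allowed_lags():
    """Lag set of eq. (3): unit steps from 25 to 56, then steps of 2 to 120."""
    return list(range(25, 57)) + list(range(58, 121, 2))


lags = allowed_lags()

# Hypothetical 6-bit code: index of the lag within the ordered set.
encode = {m: code for code, m in enumerate(lags)}

lowest_pitch_hz = FS / lags[-1]    # longest period, lowest frequency
highest_pitch_hz = FS / lags[0]    # shortest period, highest frequency
```

The set has exactly 64 members, so every lag fits in six bits, and the end points reproduce the 66.7-Hz to 320-Hz range.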

The autocorrelation estimate $r_n(m)$ is computed using an exponential
window function

$$f(n) = \begin{cases} \gamma^n, & n \ge 0 \\ 0, & n < 0. \end{cases} \qquad (4)$$

This form allows $r_n(m)$ to be sequentially updated according to the
relation

$$r_n(m) = \gamma\, r_{n-1}(m) + x(n)\, x(n - m). \qquad (5)$$

In this manner, only two multiplications and one addition are required
to update each autocorrelation coefficient.
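That the recursion (5) reproduces the windowed sum (1) can be confirmed with a short sketch. The function names are hypothetical, and the signal is assumed to be zero before its first sample so that the infinite sum becomes finite.

```python
def direct_autocorr(x, n, m, gamma):
    """Eq. (1) with the exponential window of eq. (4), assuming x is zero
    before index 0: sum of gamma**(n - l) * x[l] * x[l - m] for l <= n."""
    return sum(gamma ** (n - l) * x[l] * x[l - m] for l in range(m, n + 1))


def recursive_autocorr(x, m, gamma):
    """Eq. (5): r_n(m) = gamma * r_{n-1}(m) + x(n) x(n - m), for every n."""
    r, history = 0.0, []
    for n in range(len(x)):
        r = gamma * r + (x[n] * x[n - m] if n >= m else 0.0)
        history.append(r)
    return history


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
r_recursive = recursive_autocorr(x, 2, 0.9)
r_direct = direct_autocorr(x, 5, 2, 0.9)
```

The last entry of `r_recursive` matches `r_direct` to within roundoff, while each update costs one multiplication by gamma, one product of samples, and one addition.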
To further reduce the amount of computation and storage required
in the DSP, eq. (5) was modified so that $r_n(m)$ is updated only every
fourth sample, i.e.,

$$r_n(m) = \gamma'\, r_{n-4}(m) + x(n)\, x(n - m), \qquad (6)$$

where γ' typically has a value of γ' = 0.95. This reduces the total
amount of computation by a factor of 4. It also reduces storage
requirements for the delayed signal x(n − m) by allowing x(n) to be
decimated (reduced in sampling rate) by a factor of 4. Thus, at any
one sample time only certain values of $r_n(m)$ can be updated.
Computation is therefore distributed over a four-cycle process in which four
input samples s(n) are received, all values of $r_n(m)$ are updated, and a
new estimate of pitch is determined every four sample times. A more
detailed description of this computational structure and the manner in
which it is implemented in the DSP is described in Ref. 7. Further
details concerning the weighting parameters g(m) and the performance
of this design are also available in the same reference.

The speech signal s(n) is passed from the pitch detector to the TDHS 
algorithm along with the 6-bit encoded pitch information, which is 
inserted after every fourth speech sample. Since the TDHS algorithm
does not require information regarding the voiced or unvoiced nature 
of the speech, no voiced/unvoiced decision is made by the pitch 
detector. During voiced regions, the pitch detector measures the speech 
periodicity, and during the unvoiced and silence regions it gives the 
best estimate of any long-term correlation that may exist in the signal, 
even though this correlation may be low. 


V. THE TDHS ALGORITHM 


The TDHS algorithm compresses the input signal s(n) by a factor of
two such that the compressed signal $s_c(n)$ contains one-half of the
original number of samples.”*’° Since the sampling rate of s(n) is 8
KHz, the sampling rate of $s_c(n)$ is, therefore, 4 KHz. This compression
process can be interpreted in the frequency domain in terms of a 2:1 
compression of the spectral bandwidth such that the original 0- to 4- 
KHz bandwidth of the signal s(n) is scaled to a 0- to 2-KHz bandwidth 
allowing the sampling rate to be reduced by the factor of two. The 
compression is achieved by reducing the frequency spacing between 
pitch harmonics and appropriately scaling the envelope of the spec- 
trum by a factor of two. 

In an alternative time-domain interpretation, the signal s(n) is
compressed by a factor of two by computing one pitch period of the
compressed signal $s_c(n)$ from a weighted average of every two pitch
periods of the input signal s(n) in a pitch-synchronous manner. The
TDHS algorithm is implemented according to this interpretation. Figure
5 illustrates this interpolation process for the compression algorithm.
Given the pitch period, $p = p_n$, from the pitch detector, the input
speech s(n) is divided into blocks of 2p samples. One block of p
samples of compressed signal $s_c(n)$ is then computed from these 2p
samples according to the following process. The first block of p samples
of s(n) is weighted by a p-sample window, w(m), m = 0, 1, ..., p − 1,
which linearly decreases from a value of 1 to 0 across the block. The
second block of p samples of s(n) is similarly weighted with a window
1 − w(m) that linearly increases from 0 to 1 across the block. The sum
of the two weighted blocks then produces one block of p samples of
the compressed signal $s_c(n)$, as illustrated in Fig. 5. The waveform of
$s_c(n)$, therefore, looks mostly like the first block of s(n) at its beginning
and mostly like the second block of s(n) at the end. In this way the
concatenation of the blocks of $s_c(n)$ forms a continuous waveform
without end effects from block to block. The next block of $s_c(n)$ is
computed in the same manner as above using the next $2p = 2p_n$
samples of s(n).

[Figure 5: waveforms of s(n) and $s_c(n)$.]

Fig. 5—TDHS compression.
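The block form of the compression step can be sketched as follows. The function name is hypothetical, and the window is assumed to be w(m) = 1 − m/p, which starts at 1 and decreases linearly; the exact end value of the window used in the hardware is not stated here.

```python
def tdhs_compress_block(block, p):
    """Combine 2p input samples into p output samples by cross-fading the
    first p samples (weight w) with the second p samples (weight 1 - w)."""
    if len(block) != 2 * p:
        raise ValueError("block must contain exactly 2p samples")
    out = []
    for m in range(p):
        w = 1.0 - m / p                      # assumed linear window, w(0) = 1
        out.append(w * block[m] + (1.0 - w) * block[m + p])
    return out


# Two identical pitch periods compress to a single copy of that period.
period = [1.0, 2.0, 3.0, 4.0]
compressed = tdhs_compress_block(period + period, 4)
```

When the input is exactly periodic with period p, the two weighted blocks are identical and the output reproduces one period, which is the sense in which the pitch structure is preserved while the sample count is halved.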
Fig. 6—TDHS expansion.

In the TDHS expansion of $\hat{s}_c(n)$ to $\hat{s}(n)$ a similar pitch-synchronous
interpolation is performed. Figure 6 illustrates this process. In this
case, 3p samples of $\hat{s}_c(n)$ are used to compute 2p samples of $\hat{s}(n)$ using
the 2p-sample overlapped windows shown by the solid lines in Fig. 6.
The windows are then moved over by p samples, as shown by the
dashed lines in Fig. 6, and the next 2p samples of $\hat{s}(n)$ are computed in
a similar process. Thus, for every p samples of new input signal, $\hat{s}_c(n)$,
2p samples of the expanded signal $\hat{s}(n)$ are computed. A careful
analysis reveals that this process results in an output waveform $\hat{s}(n)$
that is continuous across the concatenated output blocks without end
effects.

Although the TDHS compression and expansion algorithms have
been discussed above in terms of block-processing operations, they are
more conveniently implemented in the DSP in a stream-processing
manner.® That is, for every two input samples of s(n) in the TDHS
compression, one sample of $s_c(n)$ is computed. Similarly, for every
input sample of $\hat{s}_c(n)$ in the TDHS expansion, two outputs of $\hat{s}(n)$ are
computed. These operations are performed using the structure shown
in Fig. 7. A length 2P shift register is used to hold the input data s(n)
(for compression) or $\hat{s}_c(n)$ (for expansion), where

$$P = \max_n p_n = 120. \qquad (7)$$

Input samples enter from the right and are shifted one sample to the
left for each input. The computation of each output sample involves
the multiplication of samples located at indices i and j in the shift
register with the respective window coefficients w(m) and 1 − w(m),
as illustrated in Fig. 7. The sum of these two products is then the
interpolated output sample. It also should be clear from the above
discussion that the indices i and j are always spaced p samples apart.

Fig. 7—Block diagram of the implementation of TDHS compression (or expansion).

For TDHS compression the data in the shift register moves two 
samples to the left and the indices i and j are decremented (shifted 
left) by one sample for each output. At the start of a new block of p 
samples, index i is initialized to the center of the shift register, p is 
initialized to p_n, j is initialized to i + p_n, and w(m) is initialized to 
w(0) = 1. The process is then repeated for the next p samples. 
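
The pitch-synchronous interpolation used for compression can be sketched in a few lines of Python. This is an illustrative block-processing version, not the DSP stream-processing code. The function name is hypothetical, and the linearly decreasing window w(m) = 1 − m/p is an assumption: the text only fixes w(0) = 1 and the complementary weights w(m) and 1 − w(m).

```python
def tdhs_compress_block(s, p):
    """Combine 2p input samples into p output samples (one pitch period).

    s : list of 2p samples of the input signal s(n)
    p : pitch period of the block, in samples
    """
    if len(s) != 2 * p:
        raise ValueError("a block must hold exactly 2p samples")
    out = []
    for m in range(p):
        w = 1.0 - m / p                 # assumed linear window, w(0) = 1
        # indices i = m and j = m + p are always p samples apart
        out.append(w * s[m] + (1.0 - w) * s[m + p])
    return out
```

For an input that is exactly periodic with period p, the two weighted samples are equal, so the output block reproduces one period of the input unchanged.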

For TDHS expansion the opposite process occurs. Since the output 
samples are generated at twice the rate of the input samples, the 
indices i and j must be incremented (shifted right) by 1 after each 
output sample. At the start of a new block, index j is initialized to the 
center of the shift register and i is initialized to j − p. Further details 
on how these operations are implemented in the DSP are covered in 
another paper.® 
Fig. 8—Block diagram of the SBC encoder and decoder. 


Another novel aspect of the above DSP implementation of TDHS 
compression and expansion algorithms involves the amount of memory 
required for the shift register. From eq. (7) and Fig. 7 we can see that 
the required length of the shift register is 2P = 240 samples, whereas 
the DSP contains only 128 locations of RAM. This conflict is conveniently 
solved by using the capability of the DSP to convert between 8-bit 
μ-law and 20-bit linear PCM formats. By storing the data in the shift 
register in 8-bit μ-law format and packing two 8-bit words into each 
20-bit RAM location, the effective memory of the DSP is doubled, 
allowing the TDHS algorithms to be completely implemented within 
individual DSPs without external memory. Also, by carefully organizing 
the “high” and “low” 8-bit words of memory it is possible in the TDHS 
algorithms to associate “high” memory with one of the indices i or j 
and “low” memory with the other. This greatly simplifies the software 
implementation.® 
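
The memory arithmetic can be illustrated with a short sketch. The helper names are hypothetical, and placing the "high" code in bits 8 to 15 of a RAM word is an assumption; the text only states that two 8-bit words share one 20-bit location.

```python
SHIFT_REGISTER_LEN = 240   # 2P samples, from eq. (7)
RAM_WORDS = 128            # RAM locations available in the DSP

def pack_pair(high, low):
    """Store two 8-bit mu-law codes in one RAM word (assumed bit layout)."""
    return ((high & 0xFF) << 8) | (low & 0xFF)

def unpack_pair(word):
    """Recover the (high, low) 8-bit codes from a packed RAM word."""
    return (word >> 8) & 0xFF, word & 0xFF

# Two samples per word: the 240-sample register needs 120 words.
words_needed = SHIFT_REGISTER_LEN // 2
```

With two codes per location the register occupies 120 of the 128 RAM words, which is why no external memory is needed.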


VI. THE SUB-BAND CODER 


The sub-band coder (SBC) encodes the compressed signal s_c(n) into 
a digital bit stream. It is a waveform coding technique that takes 
advantage of the temporal and spectral properties of speech production 
and speech perception by partitioning the signal into a set of 
sub-bands by a filter bank (see Refs. 1, 9, 11, and 14). Each sub-band is 
effectively bandpass filtered, low-pass translated to dc, sampled at its 
Nyquist rate (twice the width of the sub-band), and then digitally 
encoded using adaptive differential PCM (ADPCM). Figure 8 illustrates 
Table I—Sub-band coder design 

        Frequency Range (Hz)         Sampling    Bits/     Bit Rate 
Band    Uncompressed   Compressed    Rate (Hz)   Sample    (b/s) 
1       0-500          0-250         500         5         2500 
2       500-1000       250-500       500         4         2000 
3       1000-2000      500-1000      1000        2         2000 
4       2000-3000      1000-1500     1000        2         2000 
TOTAL   0-3000                                             8500 

a simplified block diagram of this process. In the receiver the digital 
signal ŝ_c(n) is reconstructed by decoding the sub-band signals, 
interpolating them, translating them back to their original spectral 
locations, and then summing them to form the decoded signal ŝ_c(n). 

The sub-band framework offers several advantages. Quantization 
noise is contained in bands to prevent masking of one frequency band 
by quantizing noise in another frequency band. Separate adaptive 
quantizer step sizes are used so that bands with lower signal energy 
have lower quantizer step sizes and contribute less quantization noise. 
By appropriately allocating bits in different bands, the shape of the 
quantization noise can be controlled in frequency. In the lower fre- 
quency bands, where pitch and formant structure must be accurately 
preserved, a larger number of bits/sample are used, whereas in upper 
frequency bands, where fricative and noise-like sounds occur in speech, 
fewer bits/sample are used. 

A four-band SBC design is used in the 9.6-kb/s coder. Table I 
summarizes the choice of bands, sampling rates, and bits/sample used 
in this design. Bandwidths are given with respect to both the 
uncompressed and compressed frequency scales. Therefore, the total 
bandwidth of the coder is 3 kHz, relative to the uncompressed frequency 
scale. The total bit rate of the SBC coder is 8.5 kb/s, leaving the 
remaining 1.1 kb/s for transmission of pitch and framing information. 

Figure 9a illustrates the manner in which the SBC analysis filter 
bank is implemented. The design is based on the use of quadrature 
mirror filter (QMF) designs,®’” which are implemented in terms of 
polyphase structures.’® The QMF approach allows a signal band to be 
divided into two equally spaced, high-pass filtered (HPF) and low-pass 
filtered (LPF) sub-bands, which are each reduced in sampling rate by 
a factor of two. This process is accomplished with a pair of symmetric 
finite impulse response (FIR) high-pass and low-pass filters. Because of 
the symmetry and the quadrature mirror relationship of these two 
filters, their coefficients are identical except for the signs of the 
odd-numbered coefficients. This property allows the computation to be 
shared between the two filters by separately computing the even taps, 
h_e(n), and the odd taps, h_o(n), of the low-pass filter h(n).*’° The sum 
of these two partial computations gives the output for the lower band 
Fig. 9 (b)—QMF filter bank tree structure for synthesis. 


Table II—Filter coefficients for h(n), g(n), and f(n) 


 n        h(n)         g(n) and f(n) 
 0      0.0006506       0.0065257 
 1     −0.0013508      −0.0204875 
 2     −0.0012601       0.0019911 
 3      0.0041581       0.0464768 
 4      0.0014272      −0.0262756 
 5     −0.0093636      −0.0992955 
 6     −0.0001722       0.1178666 
 7      0.0178820       0.4721122 
 8     −0.0041094 
 9     −0.0311553 
10      0.0144688 
11      0.0529093 
12     −0.0392449 
13     −0.0998001 
14      0.1284651 
15      0.4664583 


and the difference gives the output for the upper band. By applying 
this two-band splitting process in the tree structure in Fig. 9a, the band 
structure in Table I is obtained where the highest band (band 5) is 
discarded in the process. In the receiver a similar tree structure (Fig. 
9b) is used in reverse order to recombine the sub-bands into a full- 
band signal. The delays in bands 3 and 4 in Fig. 9 are used to 
compensate for the processing delay caused by the extra split used to 
generate bands 1 and 2. 
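
The shared even-tap/odd-tap computation of one two-band split can be sketched as follows. This is a minimal illustration, not the DSP code: the function name and the zero-padded start-up are assumptions, and the 16-tap filter is the g(n) design listed in Table II.

```python
# First half of the 16-tap design g(n) from Table II; g(n) = g(15 - n).
G_HALF = [0.0065257, -0.0204875, 0.0019911, 0.0464768,
          -0.0262756, -0.0992955, 0.1178666, 0.4721122]
G = G_HALF + G_HALF[::-1]

def qmf_split(x, h=G):
    """Split x into lower and upper bands, each at half the input rate."""
    n_taps = len(h)
    padded = [0.0] * (n_taps - 1) + list(x)     # assumed zero initial state
    lower, upper = [], []
    for m in range(len(x) // 2):
        t = 2 * m + 1 + (n_taps - 1)            # newest sample for output m
        even = sum(h[k] * padded[t - k] for k in range(0, n_taps, 2))
        odd = sum(h[k] * padded[t - k] for k in range(1, n_taps, 2))
        lower.append(even + odd)                # low-pass output
        upper.append(even - odd)                # high-pass output
    return lower, upper
```

A constant input should appear only in the lower band and an input alternating in sign at the sampling rate only in the upper band, which gives a quick check that the sum and difference are wired correctly.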

The above QMF filter-bank design offers two advantages. First, it 
leads to an efficient means of computing the filters by sharing com- 
putation among bands. Second, because of the quadrature nature of 
the design, aliasing terms, which are generated in the process of 
sampling rate reduction in the encoder, are canceled by imaging terms, 
which are generated in the filter-bank interpolation process in the 
receiver. This property allows the use of relatively low-order finite 
impulse response (FIR) filter designs in the coder. 

The low-pass filter design, h(n), for the first stage of QMF splitting is 
accomplished with a 32-tap FIR filter. The coefficients for this design’ 
are given in Table II, column 2. Note that only the first symmetric 
half of the coefficients is given. The remaining coefficients can be 
obtained from the relation h(n) = h(31 − n) for n = 16 to 31. The filter 
designs g(n) and f(n) for the second and third stages of QMF splitting 
are accomplished with identical 16-tap FIR filters. Table II, column 3, 
gives the coefficients for this design. Again, only the first symmetric 
half of the coefficients is given and coefficients from n = 8 to 15 can be 
obtained from the relation g(n) = g(15 − n). All coefficient values are 
quantized to the 16-bit accuracy of the DSP in the actual 
implementation. 
Fig. 10—Frequency response for (a) the composite five-band filter bank, and (b) 
the analysis/synthesis reconstruction based on the filter bank. 


The frequency response for the combined filter bank is shown in 
Fig. 10a, where the band structure given in Table I is clearly apparent. 
Note also that because the filters in the second and third stages of the 
structures in Fig. 9 are implemented at lower sampling rates, their 
frequency responses are respectively scaled to narrower widths in the 
combined system. Finally, Fig. 10b shows a plot of the total frequency 
response of the back-to-back filter bank (without ADPCM coding). A 
total reconstruction error of less than 0.25 dB is observed. 

The ADPCM coders for quantizing the sub-band signals are identical 
to the DSP design discussed in Ref. 18, except for the choice of design 
parameters. Therefore, we will only briefly review the design and 
discuss the choice of parameters. Figure 11 is a block diagram of this 
design. The main elements of the coder are: (i) a b-bit PCM quantizer 
(where b can be varied from 2 to 5), (ii) tables to store the step size 
and inverse step size, (iii) a step-size adaptation circuit to control the 
choice of step size (i.e., the addresses to the tables), and (iv) a 
first-order fixed predictor. 

A predictor signal p(n) is first computed by scaling the previous 
decoded sub-band signal ŝ(n − 1) by the predictor coefficient β, where 
for telephone speech, values of β will be close to zero (e.g., 0, −0.4, 0, 
Fig. 11 (a)—Block diagram of the ADPCM encoder. 
Fig. 11 (b)—Block diagram of the ADPCM decoder. 


0 for bands 1 to 4"'). The signal p(n) is then subtracted from the 
sub-band signal s(n) to produce the difference e(n), which is adaptively 
quantized. This is accomplished by first scaling e(n) down by the 
inverse step size ∇(n) and limiting and rounding it (PCM coding) to the 
b-bit integer I(n). By adding the value 2^(b−1) to I(n) a positive b-bit 
code word, J(n), is obtained, which can be conveniently multiplexed with 
code words from other sub-bands (as well as pitch and synchronization 
information). 

The b-bit integer, I(n), is also used in the encoder and decoder for 
obtaining the decoded output ŝ(n) and for step-size control. Decoding 
is accomplished by adding 0.5 to I(n) and scaling it back up by the 
step size Δ(n). By adding the predictor value p(n) to this quantized 
difference we obtain the decoded signal ŝ(n), which is the output of 
the decoder. It is also used in both the encoder and decoder for 
generating the next predictor value. 

The step size Δ(n) and its inverse ∇(n) are determined by the table 
address, which is adaptively incremented or decremented by the 
step-size control. Each table holds 64 words and spans a 60-dB 
step-size range (i.e., 0.9375 dB/word). The address is obtained from 
the integer part of the variable d(n) (see Fig. 11), which is limited to 
the range −32 to +31. The value of d(n) is stored in a “leaky integrator” 
with a leak factor of 0.98. In this way d(n) “drifts” toward zero (the 
Table III—Quantizer adaptation parameters 

                                Attack Rate    Decay Rate 
No. of Bits    m_1     m_2      (dB/sample)    (dB/sample) 
2              4.88    −0.98    4.6            −0.92 
3              5.47    −1.27    5.1            −1.2 
4              5.47    −1.27    5.1            −1.2 
5              5.47    −1.27    5.1            −1.2 


center of the tables) and mitigates the effects of channel error. If an 
outermost (positive or negative) quantizer level is used, a positive 
value, m[I(n)] = m_1, is added to d(n), and if an innermost (positive or 
negative) level of the quantizer is used, a negative value, m[I(n)] = m_2, 
is added to d(n). In this way the step size is dynamically varied to 
match the range of e(n) to the range of the PCM quantizer. Table III 
shows typical values of m_1 and m_2 for b = 2 to b = 5 bit quantizers in 
SBC and their corresponding “attack” and “decay” rates (in dB/sample) 
at which they can respectively expand or contract their step sizes. 
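
The ADPCM loop described above can be summarized in a short Python sketch. It is a floating-point illustration under several assumptions that are not stated in the text: the smallest step size DELTA_MIN, the use of floor() for the "limiting and rounding" step, no change to d(n) for intermediate quantizer levels, and applying the leak before adding m[I(n)]. Class and function names are hypothetical.

```python
import math

LEAK = 0.98                        # leak factor for d(n)
TABLE_SIZE = 64                    # words in the step-size table
DB_PER_WORD = 60.0 / TABLE_SIZE    # 0.9375 dB per table entry
M_TABLE = {2: (4.88, -0.98), 3: (5.47, -1.27),
           4: (5.47, -1.27), 5: (5.47, -1.27)}   # (m_1, m_2), Table III
DELTA_MIN = 1.0e-3                 # assumed smallest step size
STEP = [DELTA_MIN * 10.0 ** (DB_PER_WORD * k / 20.0)
        for k in range(TABLE_SIZE)]

class AdpcmState:
    """State shared by the encoder and the decoder of one sub-band."""

    def __init__(self, bits, beta):
        self.bits, self.beta = bits, beta
        self.d = 0.0               # step-size control variable d(n)
        self.prev = 0.0            # previous decoded sample

    def step_size(self):
        return STEP[int(math.floor(self.d)) + TABLE_SIZE // 2]

    def decode_and_adapt(self, i_code):
        half = 1 << (self.bits - 1)
        p = self.beta * self.prev                  # predictor value p(n)
        s_hat = (i_code + 0.5) * self.step_size() + p
        m_1, m_2 = M_TABLE[self.bits]
        if i_code in (-half, half - 1):            # outermost levels
            m = m_1
        elif i_code in (-1, 0):                    # innermost levels
            m = m_2
        else:
            m = 0.0                                # assumption
        self.d = min(31.0, max(-32.0, LEAK * self.d + m))
        self.prev = s_hat
        return s_hat

def encode(state, s):
    """Return the positive code word J(n) and the locally decoded sample."""
    half = 1 << (state.bits - 1)
    e = s - state.beta * state.prev                # difference e(n)
    i_code = max(-half, min(half - 1, math.floor(e / state.step_size())))
    return i_code + half, state.decode_and_adapt(i_code)

def decode(state, j_code):
    """Recover the decoded sample from the code word J(n)."""
    return state.decode_and_adapt(j_code - (1 << (state.bits - 1)))
```

Because the encoder runs the same decode-and-adapt step as the decoder, the two stay in lockstep when no channel errors occur; the leak in d(n) is what lets them reconverge after an error.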

The manner in which the computation in the SBC is performed is 
strongly determined by the three-stage framework of the filter bank. 
Since the sampling rate is reduced by a factor of two at each stage, 
computation in different stages must be performed at different rates. 
This is accomplished by distributing the total computation of the 
coder (and decoder) over an 8-cycle process. In each cycle one input 
sample s_c(n) is received in the coder, one path of the filter bank “tree” 
is computed, and one ADPCM coder is implemented. The cycle numbers 
n = 0, 1, 2, ..., 7 in Fig. 9 indicate the paths of computation taken in 
each cycle. For example, in cycle 0 of the SBC encoder, filters h_o(n), 
g_o(n), and f_o(n) in the uppermost branch of the tree are updated and 
the ADPCM coder in band 1 is computed. As can be seen from this 
structure, stages and coders in the tree that have higher sampling rates 
are computed more often and those with lower sampling rates are 
computed less often. 

The tree structure also determines the manner in which output bits 
from the ADPCM coders are multiplexed and framed for transmission. 
Table IV lists the cycle number and the sub-band that is coded in each 
cycle. After eight cycles of processing, eight input samples are received 
and 17 output bits are generated. The process is then repeated. 


VII. MULTIPLEXING AND FRAMING OF THE DATA 


The 9.6-kb/s data from the TDHS/SBC coder is multiplexed into 
frames of 96 bits each for serial transmission over the channel. Each 
frame corresponds to 10 ms of encoded speech. The multiplexing is 
performed in the third DSP along with the multiplexing of the sub-band 
coder data. The output from this DSP is in the form of 16-bit words 
Table IV—Order of computation in the sub-band coder 


Cycle    Band    Bits 
0        1       5 
1        4       2 
2        3       2 
3        (5)     - 
4        2       4 
5        4       2 
6        3       2 
7        (5)     - 
Fig. 12—Frame structure for the 9.6-kb/s bit stream. 


such that six words form one 96-bit frame. Figure 12 summarizes this 
multiplexing structure for one frame of data. 

The first five bits in the frame represent a synchronization header, 
which is used for frame locking in the receiver. The next six bits are 
used to transmit pitch from the pitch detector. The remaining bits are 
used for the transmission of five groups of 17 bits each of SBC data, 
where each group comprises one 8-cycle computational loop in the SBC 
coder. From Fig. 12 we can see that one 16-bit word can be obtained 
from the DSP after encoding band 1 in each 8-cycle loop of the SBC 
coder, and a final 16-bit word can be transmitted at the end of the 
frame. In the receiver a similar decoding process is performed in the 
demultiplexer of the SBC decoder. 
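
The frame arithmetic (5 + 6 + 5 × 17 = 96 bits, or six 16-bit words) can be checked with a small packing sketch. The bit ordering is an assumption: fields are written most significant bit first, and the code words of each group follow the cycle order of Table IV. The exact layout of Fig. 12 may differ, and the function names are hypothetical. The header pattern is the one given in Section VIII.

```python
HEADER_BITS = [1, 0, 0, 1, 0]          # 5-bit synchronization header
PITCH_WIDTH = 6                        # bits used for the pitch code
GROUPS_PER_FRAME = 5                   # 8-cycle loops per frame
# (cycle, band, bits) from Table IV; band 5 is discarded.
SCHEDULE = [(0, 1, 5), (1, 4, 2), (2, 3, 2), (3, None, 0),
            (4, 2, 4), (5, 4, 2), (6, 3, 2), (7, None, 0)]
WIDTHS = [bits for _, band, bits in SCHEDULE if band is not None]

def to_bits(value, width):
    """Return value as a list of width bits, most significant first."""
    return [(value >> (width - 1 - k)) & 1 for k in range(width)]

def pack_frame(pitch, groups):
    """Pack one frame into six 16-bit words.

    groups : five lists, each with the six ADPCM code words J(n)
             produced in one 8-cycle loop, in Table IV order.
    """
    bits = HEADER_BITS + to_bits(pitch, PITCH_WIDTH)
    for codes in groups:
        for code, width in zip(codes, WIDTHS):
            bits += to_bits(code, width)
    if len(bits) != 96:
        raise ValueError("a frame must hold exactly 96 bits")
    return [int("".join(str(b) for b in bits[k:k + 16]), 2)
            for k in range(0, 96, 16)]
```

Each 8-cycle loop contributes 17 bits, so five loops plus the 11 bits of header and pitch fill the frame exactly.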


VIII. SYNCHRONIZATION DETECTION AND DATA ALIGNMENT 


Before the data in the receiver can be decoded, the framing structure 
in the bit stream must first be identified and the data must be aligned 
with the 16-bit input words according to Fig. 12. This synchronization 
detection and data alignment is accomplished in the first DSP of the 
receiver with the aid of the bit-slipping logic at the input from the 
channel (see Fig. 3). 

Figure 13 shows a block diagram of this process. The algorithm is 
divided into two modes of operation, a synchronization search mode 
and a run mode. In the search mode the DSP receives input from the 
channel in the form of 16-bit words. If the first five bits match the bit 
Fig. 13—Block diagram of the synchronization detection and data alignment 
algorithm. 


pattern of the header, it is assumed that the first word of a frame has 
been found. It then discards the next five words of the frame and 
proceeds to the run mode. 

If no match is found, it waits for the next 16-bit input and tries 
again. If it is unsuccessful in finding a header after seven trials it 
assumes that the frame structure is out of alignment with the word 
structure. It then sends a flag to the bit-slipping logic, which discards 
one bit in the bit stream. The search process is then repeated. A total 
of seven trials are used to avoid the possibility of skipping a bit in the 
middle of the header. 

Once the algorithm is in run mode it continues to check the header 
at the beginning of each frame. If a match is found in one or more of 
the past three frames it assumes that the algorithm is synchronized 


2284 THE BELL SYSTEM TECHNICAL JOURNAL, NOVEMBER 1982 


and proceeds to process the frame. In this way the algorithm is more 
resistant to channel errors that may occur in the header. If no match 
is found after three consecutive frames, it assumes that the system is 
out of synchronization (or had received a false start) and it returns to 
the search mode. 

The above algorithm works well and uses a minimum amount of 
code in the DSP. The bit pattern for the header was chosen to be the 
5-bit sequence 10010. 
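
The search and run modes can be sketched as a small state machine operating on a list of bits. This is an illustration only: the function names are hypothetical, the bit-slipping logic is modeled by advancing the read position by one bit, and resuming the search at the current frame boundary after three consecutive misses is an assumption.

```python
HEADER = 0b10010        # 5-bit header pattern
FRAME_BITS = 96         # six 16-bit words per frame
SEARCH_TRIALS = 7       # words examined before slipping one bit
MAX_MISSES = 3          # consecutive missed headers tolerated in run mode

def read_word(bits, pos):
    """Return the 16-bit word starting at bit offset pos, or None."""
    if pos + 16 > len(bits):
        return None
    word = 0
    for b in bits[pos:pos + 16]:
        word = (word << 1) | b
    return word

def has_header(word):
    return (word >> 11) == HEADER

def frame_starts(bits):
    """Return the bit offsets of the frames processed in run mode."""
    pos, starts = 0, []
    while True:
        # Search mode: try seven words, then slip one bit and repeat.
        found = False
        while not found:
            for _ in range(SEARCH_TRIALS):
                word = read_word(bits, pos)
                if word is None:
                    return starts
                pos += 16
                if has_header(word):
                    found = True
                    break
            if not found:
                pos += 1                       # bit slip
        pos += FRAME_BITS - 16                 # discard the next five words
        # Run mode: keep checking the header of every frame.
        misses = 0
        while pos + FRAME_BITS <= len(bits):
            if has_header(read_word(bits, pos)):
                misses = 0
            else:
                misses += 1
                if misses == MAX_MISSES:
                    break                      # back to search mode
            starts.append(pos)
            pos += FRAME_BITS
        else:
            return starts
```

Because a frame is exactly six words, the word boundaries keep a fixed offset from the frame boundaries until a bit is slipped; seven trials guarantee that at least one word in the current alignment would have started a frame.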


IX. PERFORMANCE 


In terms of quality, the real-time DSP version of the TDHS/SBC coder 
matches the quality of the computer simulations.” The degradations 
introduced by TDHS compression/expansion are generally perceived as 
a form of reverberance, and degradations introduced by the SBC coder 
are generally perceived as a form of quantization noise and harmonic 
distortion. 

In a recent study involving TDHS/SBC coding of telephone network 
speech it was found that the perceived reverberance caused by the 
TDHS processing was more noticeable than that for high-quality micro- 
phone speech.” This implies that some caution must be exercised in 
applications involving a direct tandem connection of TDHS/SBC with 
the telephone network environment. Later we will show how this effect 
can be mitigated to some extent. For applications where the charac- 
teristics of the transducer can be controlled, this poses no problem. 

The reason for the increased reverberance for telephone network 
speech was found to be a consequence of the strong pre-emphasis 
caused by the 500-type telephone set specifications and the tight 
bandpass filtering (200 to 3200 Hz) of the D-channel bank. Figure 14 shows the 
resulting frequency response of the network environment.” This can 
be compared to the flat response of a high-quality microphone. 

The network frequency response of Fig. 14 has two effects on the 
perceived quality of TDHS processing. Both effects stem from the fact 
that TDHS depends strongly on an accurate pitch measurement for its 
performance, as well as the assumption that voiced speech can be 
modeled as a pseudoperiodic signal. For example, if Δf represents the 
error in the measurement of the fundamental pitch harmonic (caused 
by measurement error or quantization of the pitch) the nth harmonic 
will have an error of nΔf. This means that TDHS processing will always 
degrade the high-frequency regions of speech more than the low- 
frequency regions. For high-quality microphone speech this high-fre- 
quency reverberance is not very apparent because for voiced sounds 
the high-frequency regions are relatively low in amplitude. In addition, 
the strong energy in the first formant helps to perceptually mask any 
degradations in the high-frequency regions. During unvoiced regions 
Fig. 14—Frequency response of the telephone network environment. 


the noise-like character of speech tends to mask most of the effects of 
TDHS processing. 

The effect of the telephone pre-emphasis in Fig. 14 is to amplify the 
high-frequency region of the speech by 15 to 20 dB relative 
to the low-frequency region. This also amplifies the high-frequency 
reverberance of TDHS. The high-pass filtering, which attenuates the band 
from 0 to 200 Hz, removes most of the fundamental harmonic of 
the speech and consequently reduces the masking process. Thus, both 
effects contribute to enhancing the perceived reverberance of the TDHS 
processing. Therefore, the problem is one of auditory perception rather 
than the TDHS algorithm performing differently on telephone or 
microphone speech. We found that if high-quality microphone speech is 
filtered with the same filter response as that of Fig. 14, either before or 
after TDHS/SBC coding, we get the same effect as with telephone 
network speech. 

The effects of the pre-emphasis in Fig. 14 on TDHS processing can be 
mitigated to some degree by undoing some of the pre-emphasis. 
Deemphasis cannot restore the fundamental harmonic, however, which is 
removed by the high-pass filtering of the 0- to 200-Hz band. Thus, the 
perceived reverberance can be reduced in amplitude (along with the 
amplitude of the high-frequency content of the speech) but the masking 
effects caused by the fundamental harmonic cannot be restored. The 
deemphasis filter used in the DSP was a simple first-order filter with 
the difference equation 


y(n) = 0.65y(n — 1) + 0.667x(n) + 0.25x(n — 1). (8) 
This filter can be inserted or removed in the DSP realization by 
controlling one of the input flags to the DSP with a switch. 
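
A direct Python transcription of eq. (8) shows how much the filter tilts the spectrum. The function names are hypothetical; the gain helper simply evaluates the transfer function implied by eq. (8), H(z) = (0.667 + 0.25 z^-1) / (1 − 0.65 z^-1), at z = +1 (dc) and z = −1 (half the sampling rate).

```python
import math

def deemphasis(x):
    """Apply y(n) = 0.65 y(n-1) + 0.667 x(n) + 0.25 x(n-1) to a list."""
    y, y_prev, x_prev = [], 0.0, 0.0
    for sample in x:
        y_now = 0.65 * y_prev + 0.667 * sample + 0.25 * x_prev
        y.append(y_now)
        y_prev, x_prev = y_now, sample
    return y

def gain_db(z):
    """Gain of eq. (8) in dB at z = +1 (dc) or z = -1 (half sampling rate)."""
    return 20.0 * math.log10(abs((0.667 + 0.25 / z) / (1.0 - 0.65 / z)))

# Difference between the low- and high-frequency extremes of the response.
tilt_db = gain_db(1.0) - gain_db(-1.0)
```

The tilt between the two band edges comes out at roughly 20 dB, which is the same order as the 15 to 20 dB of pre-emphasis attributed to the network response.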

A final observation that we have made is that the TDHS/SBC coder 
appears to perform slightly better with electret microphone speech 
than with carbon-button microphone speech. We speculate that this is 
due to the increased harmonic distortion of the carbon-button micro- 
phone over that of the electret. 


X. CONCLUSIONS 


In this paper we have discussed the overall aspects of the design of 
the 9.6-kb/s TDHS/SBC speech coder. An attractive feature of the 
design is that it can be broken down into a set of five highly 
modularized parts, each of which is implemented in a single DSP and 
effectively fully utilizes the capabilities of the DSP. It therefore leads 
to a relatively 
efficient and low-complexity approach to a 9.6-kb/s coder. It offers 
good “communications” quality that is competitive with that of other 
higher complexity systems. 

The overall processing delay of the coder and decoder is about 80 
ms or about 160 ms for a full-duplex system. This amount of delay is 
fairly typical compared with other competitive techniques at this bit 
rate. 
Current models for isolated word recognition perform very well on 
small vocabularies of distinctly different sounding words. However, 
when we are confronted with vocabularies of similar sounding words 
(e.g., the letters of the alphabet), the performance of isolated word 
recognizers decreases dramatically. By carefully reexamining the 
model used for isolated word recognition we have identified some of 
the inherent deficiencies. In this paper we propose an improved word- 
recognition model that is inherently capable of accurately recogniz- 
ing words from almost any vocabulary. We have investigated a simple 
implementation of the model that preserves most of the structure of a 
linear predictive coding (LPC)-based version of the canonic isolated 
word model. In an experimental evaluation of the improved model, 
using an alpha-digit vocabulary, recognition accuracy improvements 
of from 1 to 5.7 percent were obtained for four talkers. The improve- 
ments were due to changes in both the analysis model and the 
decision procedure. The strengths and weaknesses of the improved 
model are discussed. 


1. INTRODUCTION 


Although the goal of continuous speech recognition by machine 
remains far out of reach, the one area of speech recognition that is 
practical with today’s technology and understanding is that of isolated 
word recognition.’ ° What is interesting about this area is that the 
general approach used to solve the isolated word-recognition problem 
(i.e., the statistical-pattern-recognition approach) bears little relation- 
ship to the way in which humans understand speech. As a result the 
vocabularies for which the isolated word recognizers can achieve good 
performance are severely constrained in both size and complexity.’ If 
we are interested in using a vocabulary for which the performance of 
the isolated word recognizer is less than perfect (e.g., the letters of the 
alphabet), then we have to rely on the syntax and semantics of the 
recognition task to provide the desired level of performance from the 
overall system.>""° 

In an effort to improve word-recognition accuracy for arbitrary 
vocabularies, we have re-examined the word-recognition model and 
proposed a somewhat more general structure. The proposed changes 
in the model include an improved feature analysis in which both long- 
time and short-time features are measured, and an improved decision 
box in which the two-pass decision rule of Rabiner and Wilpon"? is 
adapted to the speaker-trained case. 

The implementation of the improved word-recognition model, which 
we have studied, is based on the standard linear predictive coding 
(LPC) word recognizer as originally proposed by Itakura.” In an effort 
to retain as much of the original structure as possible, we have used 
the standard LPC analysis as the long-time features, and a new LPC 
analysis based on 15-ms frames as the short-time features. Experimen- 
tation with the improved model, using a 39-word vocabulary of the 
alphabet, the digits, and three command words in a speaker-trained 
mode, showed improvements in accuracy of from 1 to 5.7 percent for 
four talkers. An analysis of the results showed that the improved 
feature analysis provided only small improvements in accuracy (from 
0 to 1.3 percent), whereas the two-pass decision rule provided some- 
what larger improvements in accuracy (from 1 to 4.4 percent). 

The outline of this paper is as follows. In Section II we briefly review 
the canonic isolated word-recognition model and discuss its strengths 
and weaknesses. We also discuss, in this section, the implementation 
of the model based on LPC feature analysis and an LPC distance 
measure. In Section III we present the improved word-recognition 
model and discuss how it was implemented within the structure of the 
LPC-based recognizer. In Section IV we describe the experimental 
evaluation of the improved model based on the alpha-digit vocabulary. 
Finally, in Section V we discuss the results and their implications for 
practical systems. 


II. THE CANONIC MODEL FOR ISOLATED WORD RECOGNITION 


Figure 1 is a block diagram of the canonic (statistical-pattern-rec- 
ognition) model for isolated word recognition. The three basic com- 
ponents of the model include: 

(i) Feature measurement in which the speech signal is analyzed to 
provide a set of Q features (e.g., filter bank energies, LPC coefficients, 
etc.) once every M samples. If the isolated word is of duration L × M 
Fig. 1—Block diagram of standard isolated word-recognition model. 


samples, then a total of L sets of features characterize the word. The 
matrix of Q × L features is called the test pattern. 

(ii) Pattern similarity measurement in which a score (similarity or 
distance) relating the similarity of the test pattern to each of a set of 
V reference patterns is computed. Pattern similarity involves both 
time alignment (registration) of the test and reference pattern, and 
distance computation along the alignment path. The output of the 
pattern similarity box is a set of V distance scores, i.e., one for each 
reference pattern. 

(iii) A decision rule in which the distance scores are used to provide 
an ordered (by distance) list of recognition candidates. Generally, the 
candidate with the smallest distance is chosen as the “recognized 
word.” 

Rather than dwelling further on the canonic model we now review 
the LPC implementation of this model, as we will be relying on this 
structure throughout this paper. We will return to the canonic model 
in Section 2.2 when we discuss its limitations and propose the improved 
model. 


2.1 The LPC-based implementation of the word recognizer 


Figure 2 is a block diagram of the feature measurement for an 
LPC-based analyzer. The digitized speech signal (digitized at a 6.67-kHz 
rate) is first preemphasized using a simple first-order digital network 
and then blocked into overlapping frames of N = 300 samples with 
consecutive frames overlapping by 200 samples. Thus, a frame spacing 
of M = 100 samples is used (i.e., 67 frames/second). Each speech frame 
is then windowed by a 300-sample Hamming window, and a pth-order 
(p = 8) autocorrelation analysis is performed. A full LPC analysis 
(using the autocorrelation method") is then performed giving the set 
of (p + 1) LPC coefficients as the features for each frame. 
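
The framing and autocorrelation steps can be sketched as follows. This is a minimal illustration: the function name is hypothetical and the preemphasis coefficient 0.95 is an assumption, since the text only specifies "a simple first-order digital network".

```python
import math

def autocorrelation_frames(x, n=300, m=100, p=8, preemph=0.95):
    """Return one list of p + 1 autocorrelation values per frame.

    n : frame length in samples      m : frame spacing in samples
    p : analysis order               preemph : assumed coefficient
    """
    # First-order preemphasis y(k) = x(k) - preemph * x(k - 1)
    y = [x[0]] + [x[k] - preemph * x[k - 1] for k in range(1, len(x))]
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1))
              for k in range(n)]                       # Hamming window
    frames = []
    for start in range(0, len(y) - n + 1, m):
        frame = [y[start + k] * window[k] for k in range(n)]
        frames.append([sum(frame[i] * frame[i + lag]
                           for i in range(n - lag))
                       for lag in range(p + 1)])
    return frames
```

At a 6.67-kHz sampling rate a spacing of 100 samples corresponds to about 67 frames per second, and each 300-sample frame spans 45 ms.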

The pattern similarity processing is carried out using a dynamic 
time-warping (DTW) algorithm in which the test pattern is simultane- 
ously time aligned with each reference pattern, and a distance along 
the time-alignment path is computed. One of the major features of this 
Fig. 2—Block diagram of LPC analysis system. 


processing is the local distance measure, which relates the distance 
between a frame of the test pattern and a frame of the reference 
pattern, of the form” 


d(T, R) = \log \left[ \frac{a_R V_T a_R^t}{a_T V_T a_T^t} \right] \qquad (1) 


where a_R and a_T are the LPC feature sets of reference and test, 
respectively, and V_T is the autocorrelation coefficient set of the test. 
The distance measure of eq. (1), called the LPC log-likelihood measure, 
can be computed using only (p + 1) multiplications and additions, and 
one logarithm.’ Furthermore, the LPC distance of eq. (1) has been 
shown to have reliable and well understood statistical properties.'*”° 
In particular, if both a_R and a_T are derived from the same underlying 
stationary random process, then d(T, R) is precisely χ² distributed 
with p degrees of freedom. This statistical behavior of the LPC distance 
holds for fricative sounds. For voiced speech, although the model is 
inexact on a frame-by-frame basis, the statistical properties are 
approximately correct on a time-average basis. 
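
The log-likelihood measure of eq. (1) can be sketched directly from its definition. The version below evaluates the full quadratic forms rather than the (p + 1)-multiplication shortcut mentioned in the text, and it obtains the LPC vectors with a standard Levinson-Durbin recursion; the function names are hypothetical.

```python
import math

def autocorr(x, p):
    """Autocorrelation values r(0) ... r(p) of the sequence x."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k))
            for k in range(p + 1)]

def levinson(r, p):
    """Return a = [1, a_1, ..., a_p] minimizing a V a^t (Levinson-Durbin)."""
    a = [1.0] + [0.0] * p
    err = r[0]
    for i in range(1, p + 1):
        acc = r[i] + sum(a[k] * r[i - k] for k in range(1, i))
        k_i = -acc / err
        new_a = a[:]
        for k in range(1, i):
            new_a[k] = a[k] + k_i * a[i - k]
        new_a[i] = k_i
        a = new_a
        err *= 1.0 - k_i * k_i
    return a

def quad_form(a, r):
    """a V a^t, with V the Toeplitz matrix V[i][j] = r(|i - j|)."""
    n = len(a)
    return sum(a[i] * a[j] * r[abs(i - j)]
               for i in range(n) for j in range(n))

def lpc_distance(a_ref, a_test, r_test):
    """Log-likelihood distance d(T, R) of eq. (1)."""
    return math.log(quad_form(a_ref, r_test) / quad_form(a_test, r_test))
```

Since a_T minimizes the quadratic form for the test autocorrelation, the ratio is never below one, so d(T, R) is nonnegative and equals zero when the reference and test coefficient sets coincide.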

To compute the pattern similarity between the test and each 
reference pattern using the DTW algorithm with the distance measure of 
eq. (1), a solution to the minimization of 


D^* = \min_{w(n)} \sum_{n=1}^{NT} d(T_n, R_{w(n)}) \qquad (2) 

must be found where NT is the number of frames in the test, and w(n) 
is the warping function relating frame n of the test to frame w(n) of 
the reference. Efficient recursive procedures for solving eq. (2) have 
been described in the literature.’*'°"° 
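
A minimal dynamic-programming sketch of the minimization in eq. (2) is given below. The local path constraints are assumptions made for illustration (the reference index may advance by 0, 1, or 2 frames per test frame, with both endpoints fixed); the procedures cited in the text use their own constraints. The function name and the generic dist argument are hypothetical.

```python
def dtw_distance(test, ref, dist):
    """Minimum accumulated distance D* over warping functions w(n).

    test, ref : sequences of frames
    dist      : local distance function d(T_n, R_w(n))
    """
    inf = float("inf")
    nt, nr = len(test), len(ref)
    # acc[n][r]: best accumulated distance with test frame n on ref frame r
    acc = [[inf] * nr for _ in range(nt)]
    acc[0][0] = dist(test[0], ref[0])          # w(1) maps to first ref frame
    for n in range(1, nt):
        for r in range(nr):
            best = min(acc[n - 1][r],
                       acc[n - 1][r - 1] if r >= 1 else inf,
                       acc[n - 1][r - 2] if r >= 2 else inf)
            if best < inf:
                acc[n][r] = best + dist(test[n], ref[r])
    return acc[nt - 1][nr - 1]                 # w(NT) maps to last ref frame
```

One local distance is accumulated per test frame, matching the sum over n = 1 to NT in eq. (2). In the recognizer, dist would be the LPC log-likelihood measure of eq. (1).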

Finally, the decision box orders the distance scores for each reference 
pattern and chooses either the reference with the minimum distance 
(the nearest neighbor rule) or the reference whose average of the K- 
best scores (for multiple-template systems) is minimum (the K-nearest 
neighbor rule) as the recognized word. When the recognized word is 
unique (i.e., only a single reference gets a small distance score), this 
simple decision rule is sufficient. However, for complex vocabularies, 
generally several references achieve small distance scores, and reliable 
recognition using the smallest distance cannot be achieved. In such 
cases a two-pass decision rule'' has been shown to increase accuracy 
by deferring the final recognition decision to a discrimination analysis 
in a second pass of the decision rule. This discrimination analysis has 
only been applied to speaker-independent systems because of the 
problems associated with obtaining appropriate word discrimination 
weights." 
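
The nearest neighbor and K-nearest neighbor rules can be sketched as follows. The data layout (a dictionary mapping each vocabulary word to the distances of its templates) and the function name are assumptions for illustration; the two-pass discrimination analysis is not shown.

```python
def knn_decision(scores, k=1):
    """Order the vocabulary words by the average of their k best distances.

    scores : dict mapping each word to a list of distance scores,
             one per reference template of that word
    k      : number of best scores averaged (k = 1 is the nearest
             neighbor rule)
    """
    averages = {}
    for word, distances in scores.items():
        best = sorted(distances)[:k]
        averages[word] = sum(best) / len(best)
    # The first entry of the ordered list is the recognized word.
    return sorted(averages, key=averages.get)
```

With acoustically similar words the two rules can disagree: a single unusually close template wins under the nearest neighbor rule, while averaging over K templates favors the word whose templates are consistently close.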


2.2 Strengths and limitations of the word-recognition model 


The strengths of the canonic word-recognition model of Fig. 1 are as 
follows: 

(i) It is invariant to different speech vocabularies, users, feature
sets, pattern similarity algorithms, and decision rules.

(ii) It is easy to implement.

(iii) It works well in practice.

The weaknesses of the model include: 

(i) The feature analysis adequately represents only long-time
stationary events in the speech signal; nonstationary and transient
events are poorly represented.

(ii) The model does not perform well for complex vocabularies with
acoustically similar words.

We now consider the first weakness of the model. By way of example 
Fig. 3 shows waveform plots of the beginning regions of two distinct 
words. Word 1 shows a silence followed by the onset of voiced speech. 
Word 2 shows a short (15 ms) transient of low-level, unvoiced speech 
(e.g., a plosive sound) followed by the onset of voiced speech. Figure 
3 also shows the placement of the first two long-time speech segments 
(frames), which contain identical data except for the first 15 ms of the 
first segment, in which one frame has silence and one frame has a short 
plosive. It should be clear that for a long-time analysis such as the LPC 
model of Section 2.1, the low-level differences in the first 15 ms of 
frame 1 will be swamped out by the high-level voiced speech in the 
last 30 ms of the frame. Thus, in a long-time stationary framework,
accurate recognition of differences between short transients and other
nonstationary regions (e.g., as occur during onsets and offsets of
voicing) is greatly limited. To ameliorate this weakness, the
feature-detection algorithm must be enhanced to include some
representation of short-time nonstationary events.

Fig. 3—How a short transient in a word can be swamped out by a voiced region in the
long-time analysis model.


Consider now the second weakness of the model. The reason that 
acoustically similar words are easily confused is that the pattern-
similarity measure (the DTW distance) gives equal weight to all frames
of the word. For differentiating words of one equivalence class from 
words of another equivalence class this procedure is reasonable. How- 
ever, within a class of acoustically similar words a discrimination 
analysis rather than a straight recognition is required. Such an analysis 
has been proposed by Rabiner and Wilpon" for the case of speaker- 
independent recognition of words. 

For speaker-trained recognizers this two-pass decision rule must be 
modified so that the optimal weighting curves for word discrimination 
could be obtained directly from the robust training procedure.” 

With the incorporation of the expanded feature analysis, a modified 
DTW algorithm, and an expanded decision rule, the basic weaknesses 
of the canonic word recognizer can be overcome to some extent. In the 
next section we describe an “improved” model for word recognition 
and show how the improvements can be incorporated directly into the 
LPC framework of Section 2.1.


III. THE IMPROVED WORD-RECOGNITION MODEL


Based on the discussion of Section 2.2, the improved word-recogni- 
tion model would have a structure of the type shown in Fig. 4. The 
major differences in the model, from that of Fig. 1, are: 

(i) The feature measurement box is expanded into three sub-blocks,
namely long-time feature measurements, short-time feature
measurements, and a stationarity profile. The long-time features are
essentially those of the original model, although the rate at which they
are measured will generally be higher for this new model than for the
original model. The short-time features are intended to characterize
transients and other nonstationary events in the speech signal. Some
typical short-time features include zero or level crossing counts over
short-time intervals, wideband (short-impulse response) filter bank
analyses, short-time LPC analyses, etc. The stationarity profile decides
which feature set (either long-time or short-time) is used to
characterize a given frame of speech, and hence is used for the distance
measure of the pattern-similarity algorithm.

Fig. 4—Block diagram of the improved, isolated word-recognition model using both
long-time and short-time features, and a two-pass discrimination model.

(ii) The DTW algorithm is expanded to use both long-time and
short-time patterns, for both test and reference patterns, in
determining similarity of a given reference pattern to the test pattern. The
stationarity profile is used to guide the alignment and to choose which
feature set is used in making a given distance computation.

(iii) The decision box is implemented as a two-pass decision. In the
first-pass decision the distance scores for each reference pattern are
ordered, and if the best distance is smaller than the second best
distance by a threshold $T^*$, the decision phase is terminated. If,
however, the top two or more references are within $T^*$ in distance, a
second-pass decision rule is used in which the similar words are
compared using a discriminant analysis and the recognized word is
chosen on the basis of this analysis. To implement the discriminant
analysis, a set of distance-weighting curves discriminating word i from
word j (for all i, j) must be saved along with the reference patterns.

We now describe how the improved model was implemented in the
framework of the LPC analysis system.
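
The first-pass logic of item (iii) can be sketched as below. This is only an illustration under stated assumptions: the name `first_pass_decision`, the argument `t_star` (standing for the threshold), and the strict "less than" comparison used to decide whether a candidate lies within the threshold of the best score are all choices made for the example.

```python
def first_pass_decision(distances, t_star):
    """Return (word, []) if the best score is unambiguous, otherwise
    (None, confusion_list) so that a second pass can resolve it.

    distances maps each vocabulary word to its first-pass distance score.
    """
    ordered = sorted(distances.items(), key=lambda item: item[1])
    best_word, best_distance = ordered[0]
    # Candidates whose distance is within t_star of the best score.
    confusions = [word for word, d in ordered if d - best_distance < t_star]
    if len(confusions) == 1:
        return best_word, []
    return None, confusions
```

For the hypothetical scores `{"B": 0.30, "D": 0.33, "X": 0.80}`, a threshold of 0.05 defers the decision with the confusion list B, D, whereas a threshold of 0.01 accepts B immediately.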


3.1 The LPC-based improved word-recognition model


Using the LPC analysis framework, the expanded feature measurement
was implemented as follows. The long-time analysis was implemented
as described in Section 2.1 except that the shift parameter, M,
was changed from M = 100 to M = 33, and the analysis frame length,
N, was changed from N = 300 to N = 297. Thus, for the long-time
analysis, analysis frames were computed every 5 ms rather than every
15 ms, thereby giving a frame rate three times larger. The analysis
frame was changed to 297 samples so as to be an integral multiple of
M, the shift parameter. We denote the long-time LPC features as $T_{LT}$.

For convenience the short-time analysis was implemented with the 
same processing (i.e., that of Fig. 2) as that of the long-time analysis, 
except that N was changed to 99 (15-ms analysis frames) and M was 
again set to 33 (5-ms frame shifts). The order of the LPC analysis was
kept at 8 for the short-time as well as the long-time analysis. We
denote the short-time LPC features as $T_{ST}$.
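
The frame parameters above can be checked with a short sketch. The sampling rate is not stated in this section; the value used below is an assumption inferred from the statement that a shift of M = 100 samples corresponds to 15 ms. The names `frame_starts`, `FS_HZ`, and `to_ms` are hypothetical.

```python
# Assumed sampling rate: 100 samples span 15 ms, i.e. about 6.67 kHz.
FS_HZ = 100 / 0.015


def to_ms(samples):
    """Duration in milliseconds of a given number of samples."""
    return 1000.0 * samples / FS_HZ


def frame_starts(num_samples, frame_len, shift):
    """Start indices of all complete analysis frames of length frame_len
    taken every `shift` samples."""
    return list(range(0, num_samples - frame_len + 1, shift))


LONG_TIME = {"N": 297, "M": 33}   # about 45-ms frames, 5-ms shift
SHORT_TIME = {"N": 99, "M": 33}   # about 15-ms frames, 5-ms shift
```

Under this assumption a 33-sample shift is 4.95 ms, and both frame lengths (297 and 99 samples) are integral multiples of the shift, as the text requires.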

To understand how the stationarity profile, p, is generated within
the framework of the LPC analysis, we must first define a
characterization of the types of speech segments that are encountered. For this
purpose we define two binary features that characterize the source and
the dynamics of the vocal tract. The first feature describes the
excitation for the frame of speech and we denote voiced speech as V, and
unvoiced speech as $\bar{V}$. The second feature describes the vocal tract
dynamics and we denote the stationary, steady-state case as SS, and
the nonstationary, time-varying case as $\overline{SS}$. Thus, a given frame of
speech is characterized by the notation (V/$\bar{V}$, SS/$\overline{SS}$).

The determination of whether a frame is voiced or unvoiced is fairly 
straightforward and is readily obtained from any number of pitch- 
detection algorithms. The determination of whether a frame is station- 
ary or nonstationary is somewhat more complicated. This computation 
is made as follows. The basic idea is to compare both the long-time
and short-time features of frames j and i, where j represents the frame
occurring 15 ms before frame i. A distance comparing frames i and j is
made as

$$a_i = \frac{d[T_{LT}(i), T_{LT}(j)] + d[T_{LT}(j), T_{LT}(i)] + d[T_{ST}(i), T_{ST}(j)] + d[T_{ST}(j), T_{ST}(i)]}{4}, \qquad (3)$$

i.e., the average of the long- and short-time LPC distances between
frames i and j and between frames j and i (recall that the LPC distance
is not symmetric). The distance score, $a_i$, is then compared with a
threshold (different for voiced and unvoiced frames), and the
stationarity value is given as

$$ss_i = \begin{cases} 1 & \text{if } V \text{ and } a_i \le THV \\ 1 & \text{if } \bar{V} \text{ and } a_i \le THU \\ 0 & \text{otherwise,} \end{cases} \qquad (4)$$

where 1 represents a stationary frame, and 0 represents a nonstationary
frame, and THV and THU are voiced and unvoiced thresholds,
respectively.

Table I—Feature sets used for similarity determination

Test Frame Status               Feature Set   Frame Spacing   Speech Example
(V, SS)                         LT analysis   15 ms           Vowels, steady-state sounds
(V, $\overline{SS}$)            LT analysis   5 ms            Onset, offset of voicing transitions
($\bar{V}$, SS)                 LT analysis   15 ms           Steady fricatives
($\bar{V}$, $\overline{SS}$)    ST analysis   5 ms            Transients

Once a frame has been characterized with the two-feature code,
(V/$\bar{V}$, SS/$\overline{SS}$), the only remaining step is to specify which feature set and
frame spacing should be used in the DTW distance computation.

It should be clear that for voiced frames, (V, -), the long-time
analysis should be used to avoid potential bias caused by the pitch
period. Similarly, for all nonstationary frames, (-, $\overline{SS}$), a frame spacing
of 5 ms should be used to track the fast dynamics of such frames.
Finally, for unvoiced, nonstationary frames, ($\bar{V}$, $\overline{SS}$), the short-time
analysis is most appropriate to follow transients and other brief events.

Table I shows a summary of the feature sets and frame spacings, for 
each of the four types of frames, as used to determine word and 
reference template similarity. 
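
The stationarity measurement of eqs. (3) and (4) and the feature-set choice of Table I can be summarized in a short sketch. It is illustrative only: the function names are hypothetical, `dist` stands for any (possibly nonsymmetric) frame distance such as the LPC distance, `lag` is the number of frames corresponding to 15 ms (3 frames at a 5-ms spacing), and the "less than or equal" comparison against the threshold is an assumption.

```python
def stationarity_distance(lt, st, i, lag, dist):
    """Eq. (3): average of the two-way long-time and short-time distances
    between frame i and the frame `lag` frames earlier."""
    j = i - lag
    total = (dist(lt[i], lt[j]) + dist(lt[j], lt[i])
             + dist(st[i], st[j]) + dist(st[j], st[i]))
    return total / 4.0


def is_stationary(a_i, voiced, thv, thu):
    """Eq. (4): compare a_i with the voiced or unvoiced threshold."""
    threshold = thv if voiced else thu
    return a_i <= threshold


def select_features(voiced, stationary):
    """Table I: return (feature set, frame spacing in ms)."""
    feature_set = "ST" if (not voiced and not stationary) else "LT"
    spacing_ms = 15 if stationary else 5
    return feature_set, spacing_ms
```

Only the unvoiced, nonstationary case uses the short-time features, and every nonstationary frame is tracked at the 5-ms spacing.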

To illustrate the above analysis, Fig. 5 shows a series of plots of (a)
the waveform, (b) the log energy (in dB), (c) the pitch, and (d) the
average of long- and short-time LPC distance [eq. (3)] for the word
/B/. It can be seen in Fig. 5d that the LPC distance becomes large at
the beginning of voicing (point A in the plot), at the termination of
voicing (point B in the plot), and at the end of the word (point C in the
plot). Such frames (and their neighborhoods) are the nonstationary
regions of the word, and generally correspond well with transients,
onset and offset of voicing, and rapidly varying vocal-tract dynamics.

To choose the stationarity thresholds, THV and THU, intelligently,
histograms of the behavior of $a_i$ for voiced and unvoiced frames
had to be measured. Such histograms are shown in Fig. 6. The data in
this figure were obtained by computing $a_i$ every 5 ms for all the frames
of a 39-word vocabulary of letters of the alphabet plus the digits. Based
on the data of Fig. 6, values for THV and THU can be chosen so as to
obtain any desired average probability of nonstationary classification
for voiced or unvoiced frames. For example, if we assume that, on average,
only 10 percent of the voiced frames should be classified as $\overline{SS}$, then
a threshold of THV = 0.2 should be used. Similarly, for unvoiced
frames a threshold of THU = 0.3 yields an average of 10 percent of the
frames being classified as nonstationary. If the thresholds, THV and
THU, are both set to infinity, then all frames are classified as stationary
and hence the feature analysis is essentially identical to that of the
original model. Similarly, if the thresholds are both set to zero, all
frames are classified as nonstationary and a 5-ms frame spacing is used
with both short- and long-time feature sets.

Fig. 5—An example (the word B) showing: (a) the waveform, (b) its energy profile,
(c) its pitch contour, and (d) the LPC distance comparing adjacent frames.

Fig. 6—Histograms of values of (a) LPC distance for voiced speech, and (b) unvoiced
speech. Thresholds THV and THU are chosen to give desired percentages of
nonstationary classification.
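
Choosing a threshold from measured stationarity distances so that a desired fraction of frames is classified as nonstationary can be sketched as follows. This is an illustrative sketch, not the procedure used to obtain the values quoted above; the name `threshold_for_fraction` is hypothetical, and it assumes that frames whose distance exceeds the threshold are the nonstationary ones.

```python
import math


def threshold_for_fraction(distances, nonstationary_fraction):
    """Return a threshold such that roughly the requested fraction of the
    measured distances lies strictly above it."""
    ordered = sorted(distances)
    n_exceed = int(round(nonstationary_fraction * len(ordered)))
    if n_exceed <= 0:
        return math.inf       # no frame exceeds an infinite threshold
    if n_exceed >= len(ordered):
        return -math.inf      # every frame exceeds the threshold
    # Largest value that still leaves n_exceed values above it.
    return ordered[len(ordered) - n_exceed - 1]
```

Applied separately to the voiced and unvoiced distance sets, this gives one threshold for each class; a fraction of zero reproduces the limiting case in which all frames are treated as stationary.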


3.2 Modifications to the DTW algorithm for the improved word model 


As discussed above, the basic changes made in the feature
measurement were inclusion of both short- and long-time LPC analyses, and an
increase in the frame rate of the analysis from once every 15 ms to
once every 5 ms. These analysis changes required some modifications
to the DTW algorithm to properly handle the raw data structure. The
modifications primarily involve reformulation of the local path
constraints to account for the different possible frame spacings (i.e.,
nonuniform sampling in time), and modifications to the distance
computation to handle both long- and short-time LPC distances and
their appropriate weights.

We denote the long-time test pattern as {$T_{LT}(n)$, n = 1, 2, ...,
NT}, the short-time test pattern as {$T_{ST}(n)$, n = 1, 2, ..., NT}, and
the stationarity distance (on which the stationarity profile is based) as
{$a_n$, n = 1, 2, ..., NT}. Similarly, we denote the long-time reference
pattern as {$R_{LT}(m)$, m = 1, 2, ..., NR} and the short-time reference
pattern as {$R_{ST}(m)$, m = 1, 2, ..., NR}.

We wish to solve for the optimum warping path of the form m =
w(n), defined for values of n that satisfy either of the following
conditions:

$$(n - 1) \bmod 3 = 0 \qquad (5a)$$

or

$$a_n > TH \quad \text{or} \quad a_{n-1} > TH \quad \text{or} \quad a_{n-2} > TH. \qquad (5b)$$

Equation (5a) says we solve for m = w(n) at each standard 15-ms time
slot. This constraint essentially guarantees a grid spacing, between
adjacent DTW frames, of no more than three frames. It also guarantees
that, in the limit, as the entire word is classified as stationary, the new
analysis becomes identical to the previous analysis. Equation (5b) says
we solve for m = w(n) at each frame, n, in which the stationarity
distance, $a_n$, of that frame or either of its two predecessors exceeds
the specified threshold, TH. (For voiced frames the threshold TH is
set to THV, and for unvoiced frames the threshold TH is set to
THU.) Cases in which eq. (5b) is satisfied (i.e., one of the distances is
above threshold) correspond to voiced frames with a rapidly changing
spectrum (transitions), or unvoiced frames with nonstationary
excitation.
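
The selection of frames at which the recursion is solved can be sketched as follows. The sketch is illustrative: the name `recursion_frames` is hypothetical, frames are numbered from 1 as in the text, and it assumes that the distances of the two predecessor frames are compared against the threshold belonging to the current frame's voicing class.

```python
def recursion_frames(a, voiced, thv, thu):
    """Return the 1-based frame indices n at which the DTW recursion is
    solved: every third frame (the 15-ms grid), plus any frame whose
    stationarity distance, or that of either of its two predecessors,
    exceeds the threshold."""
    selected = []
    for n in range(1, len(a) + 1):
        on_grid = (n - 1) % 3 == 0
        threshold = thv if voiced[n - 1] else thu
        recent = a[max(0, n - 3):n]          # a_{n-2}, a_{n-1}, a_n
        if on_grid or any(x > threshold for x in recent):
            selected.append(n)
    return selected
```

When no distance exceeds the threshold, only the 15-ms grid frames 1, 4, 7, ... are selected, which matches the limiting behavior described in the text; a single above-threshold distance at frame 5 adds frames 5, 6, and 7.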

For each frame n that satisfies one of the constraints of eq. (5) we
must solve the DTW recursion

$$D_A(n, m) = w(n - n_L)\, d(n, m) + \min_{m_l \le m_0 \le m_h} \left[ D_A(n_L, m_0) \right], \qquad m_L \le m \le m_U, \qquad (6)$$

where

$n_L$ = last value of n for which a DTW recursion was done.
$n_{LL}$ = next-to-last value of n for which a DTW recursion was
done.
$w(n - n_L)$ = weighting function on the local distance to account for
the nonuniform frame spacing.
d(n, m) = local frame distance for reference frame m and test
frame n.
$m_l$ = smallest value of m at n = $n_L$ from which a valid path
can go to the grid point (n, m).
$m_h$ = largest value of m at n = $n_L$ from which a valid path
can go to the grid point (n, m).
$m_L$ = smallest value of m at frame n for which DTW recursion
is solved.
$m_U$ = largest value of m at frame n for which DTW recursion
is solved.

The values of $m_L$ and $m_U$ are determined from the global path
constraints which specify that all valid DTW paths must lie within a
parallelogram defined from lines of slope 2 and slope 1/2 beginning at
grid point (0, 0) and ending at grid point (NT, NR). Thus, $m_L$ and $m_U$
satisfy the path constraints

$$m_L = \max[(n - 1)/2 + 1,\ 2(n - NT) + NR,\ 1] \qquad (7a)$$

$$m_U = \min[2(n - 1) + 1,\ (n - NT)/2 + NR,\ NR]. \qquad (7b)$$
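
The global path constraints of eqs. (7a) and (7b) can be evaluated with a short sketch. The equations do not state how fractional values are rounded; the sketch assumes the lower bound is rounded up and the upper bound rounded down so that both lie inside the parallelogram. The name `global_bounds` is hypothetical.

```python
import math


def global_bounds(n, nt, nr):
    """Return (m_low, m_high), the range of reference frames allowed at
    test frame n (1-based), for a test of nt frames and a reference of
    nr frames."""
    m_low = max(math.ceil((n - 1) / 2 + 1), 2 * (n - nt) + nr, 1)
    m_high = min(2 * (n - 1) + 1, math.floor((n - nt) / 2 + nr), nr)
    return m_low, m_high
```

For NT = NR = 10 the allowed range collapses to the single point m = 1 at n = 1 and to m = 10 at n = 10, and is widest near the middle of the word (m from 3 to 7 at n = 5).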


The values of $m_l$ and $m_h$ are those which guarantee that the path to
grid point (n, m) satisfies the local constraint that the average slope be
no less than one half nor more than 2. If we define a path increment
function, $\Delta(m)$, as

$\Delta(m)$ = increment in m along the best path to grid point ($n_L$, m),

i.e., if the best path to grid point ($n_L$, m) comes from grid point
[$n_{LL}$, m − $\Delta(m)$], then values of $m_0$ in the DTW recursion [eq. (6)] must satisfy
the local path constraint

$$\frac{n - n_{LL}}{2} \le \Delta(m_0) + (m - m_0) \le 2(n - n_{LL}). \qquad (8)$$

Since $\Delta(m_0)$ also satisfies the constraint

$$\Delta(m_0) \le 2(n_L - n_{LL}), \qquad (9)$$

we can rewrite the inequalities of eq. (8) as

$$m_h - \Delta(m_h) \le m - \frac{n - n_{LL}}{2} \qquad (10a)$$

$$m_l = m - 2(n - n_L). \qquad (10b)$$

Equation (10a) must be checked for each possible m value to find its
solution, whereas eq. (10b) can be used directly.
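The weighting function $w(n - n_L)$ is simply

$$w(n - n_L) = (n - n_L) \qquad (11)$$

to give more weight to longer frame separations, and the distance
d(n, m) is of the form

$$d(n, m) = \begin{cases} d[T_{LT}(n), R_{LT}(m)] & \text{if } (V, SS),\ (V, \overline{SS}),\ \text{or } (\bar{V}, SS) \qquad (12a) \\ d[T_{ST}(n), R_{ST}(m)] & \text{if } (\bar{V}, \overline{SS}). \qquad (12b) \end{cases}$$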


The complicated form of the DTW recursion is due to the nonuniform
sampling rate at which the recursion is solved. If we translate eqs. (6)
through (12) into words we can say that for each frame n for which the
recursion is solved we compute $D_A(n, m)$ for a range of m from m = $m_L$
to m = $m_U$, as determined by the global path constraints. For each m
the optimal path is determined as the weighted local distance,
$d(n, m)\,w(n - n_L)$ (as determined by the stationarity profile at frame
n), plus the best accumulated distance to a predecessor frame that is a
valid candidate for a path to frame m (i.e., $m_l \le m_0 \le m_h$). The range
on $m_0$ is chosen to guarantee that the local path constraints of a
warping curve slope of between 1/2 and 2 are met. Since the number
of frames between the current frame n and the predecessor frame $n_L$,
for which the DTW recursion was last solved, is variable (ranging from
1 to 3), the local path constraints must use this range, along with
information as to how much the local path rose [$\Delta(m_0)$] at grid point
($n_L$, $m_0$), to set the local path constraints correctly.

The DTW recursion of eq. (6) is solved for all valid points from n =
1 to n = NT, and the total DTW solution is then given as

$$D^* = D_A(NT, NR) \qquad (13)$$

and the average path distance is

$$\bar{D} = \frac{D_A(NT, NR)}{NT}. \qquad (14)$$


3.3 The improved decision rule 


As we discussed earlier a two-pass decision rule is used to improve 
recognition accuracy. The task of the first-pass decision rule is to 
determine the set of vocabulary words that are acoustically similar to 
the test word (i.e., the set of confusions). The task of the second-pass 
decision rule is then to resolve these confusions. 

The key idea behind the operation of the second-pass decision rule
is that the DTW distance scores between the test pattern and those
reference patterns that are acoustically close to each other and to the
test pattern consist of a $\chi^2$ random component and a Gaussian random
component. The $\chi^2$ random component is associated with the averaging
of distance scores between frames with the same basic spectrum, and
therefore has a $\chi^2$ distribution with p degrees of freedom. The Gaussian
random component is associated with the averaging of large distance
scores between frames with dissimilar spectra.

In cases where the size of the dissimilar region is small (such as in
comparing a /B/ to a /D/) compared to the size of the similar region,
the $\chi^2$ component distance often outweighs the Gaussian component,
thereby causing potential recognition errors.

The purpose of the second-pass decision rule is to enhance the role 
of the Gaussian component associated with spectrally dissimilar re- 
gions in determining the final decision. This is accomplished using a 
distance-weighting function that enhances the discrimination power of 
the frame-by-frame distance scores. 

By way of example, consider a simple confusion list of two references,
$R_i$ and $R_j$, for test word T. Let the DTW frame-by-frame distance and
warping path be specified as

$$d_i(n) = d\{T(n), R_i[w_i(n)]\} \qquad (15)$$

and

$w_i(n)$ = warping path comparing frame n of the test with reference
$R_i$.

We now define two distance-weighting functions,

$$\{W^{i,j}(n),\ n = 1, 2, \ldots, NR_i\}$$

$$\{W^{j,i}(n),\ n = 1, 2, \ldots, NR_j\},$$

where $W^{i,j}(n)$ is the weighting to discriminate $R_i$ from $R_j$, and $W^{j,i}(n)$
is the weighting to discriminate $R_j$ from $R_i$. (Reference 11 shows that
these weighting functions are generally not symmetric.) We defer a
discussion of how the weights are generated, in a speaker-trained
system, to Section 3.4.

The basic hypothesis is that the test pattern, T, corresponds to
either $R_i$ or $R_j$, and we wish to come up with a discrimination score
that aids in this decision. If we define a discrimination score,
$\delta(T, R_i \mid T \in R_j)$, as the weighted distance between T and $R_i$, assuming
that T actually corresponds to $R_j$, then we get

$$\delta(T, R_i \mid T \in R_j) = \frac{\sum_{k=1}^{NT} W^{i,j}[w_i(k)]\, d\{T(k), R_i[w_i(k)]\}}{\sum_{k=1}^{NT} W^{i,j}[w_i(k)]} \qquad (16)$$

and similarly we get

$$\delta(T, R_j \mid T \in R_i) = \frac{\sum_{k=1}^{NT} W^{j,i}[w_j(k)]\, d\{T(k), R_j[w_j(k)]\}}{\sum_{k=1}^{NT} W^{j,i}[w_j(k)]}. \qquad (17)$$

The weighted distance corresponding to the hypothesis $T \in R_j$ [i.e.,
eq. (16)] is shown in Fig. 7. The frame-by-frame distance is multiplied
by the weighting function reflected through the warping curve to give
the discrimination score $\delta$.
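
The discrimination score is a normalized weighted average of the frame-by-frame distances, with the weighting curve read through the warping path. A minimal sketch follows; the name `discrimination_score` and the zero-based list layout are assumptions made for the example.

```python
def discrimination_score(weights, path, frame_dists):
    """Weighted average of frame distances along a warping path.

    weights[m]     : weighting curve, indexed by reference frame m
    path[k]        : reference frame aligned with test frame k
    frame_dists[k] : distance between test frame k and reference frame path[k]
    """
    numerator = sum(weights[path[k]] * frame_dists[k] for k in range(len(path)))
    denominator = sum(weights[path[k]] for k in range(len(path)))
    return numerator / denominator
```

With a uniform weighting curve the score reduces to the ordinary average distance, whereas a curve concentrated on one reference frame makes the score equal to the distance in that region alone. Because of the normalization, scaling the whole weighting curve by a constant leaves the score unchanged.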

The discrimination distances of eqs. (16) and (17) have the following
important property. If T and $R_i$ are from the same word (different
replications) then the frame-by-frame distances, d(·, ·), are all $\chi^2$
distributed (theoretically) and thus $\delta(T, R_i \mid T \in R_j)$ will be
theoretically “independent” of the weighting function. If, however, T and $R_j$
(instead of $R_i$) are from the same word, then $\delta(T, R_i \mid T \in R_j)$ will
reflect to a greater extent the Gaussian-distributed component of the
original distance score, $d(T, R_i)$, since it primarily consists of distance
in regions where $R_i$ and $R_j$ differ significantly, even though they may
be quite short.

Thus, in the simple case of a confusion between two references, $R_i$
and $R_j$, the final decision is made on the basis of the discrimination
scores of eqs. (16) and (17).

More generally, if the confusion list associated with test pattern T
has Q candidates, {$R_{i_1}, R_{i_2}, \ldots, R_{i_Q}$}, then the following procedure is
followed:

(i) Compute all pairs of discriminations

$$\delta(T, R_{i_a} \mid T \in R_{i_b}), \qquad b \ne a, \quad a, b = 1, 2, \ldots, Q.$$

(ii) Form the average discrimination distance

$$\bar{\delta}(T, R_{i_a}) = \frac{1}{Q - 1} \sum_{\substack{b = 1 \\ b \ne a}}^{Q} \delta(T, R_{i_a} \mid T \in R_{i_b}), \qquad a = 1, 2, \ldots, Q.$$

(iii) Define the most likely candidate as the candidate with the
minimum average discrimination distance, i.e.,

$$\delta_{\min} = \min_a \{\bar{\delta}(T, R_{i_a})\}.$$

Similarly, a least likely candidate with maximum distance is defined
as

$$\delta_{\max} = \max_a \{\bar{\delta}(T, R_{i_a})\}.$$

(iv) Given the original (i.e., first-pass) distance scores for all Q
candidates, $d(T, R_{i_a})$, with smallest distance $d_{\min}$ and largest distance
$d_{\max}$, a second-pass set of distance scores is computed by retaining
second-pass ordering with first-pass distances. This procedure is
illustrated in Fig. 8. A reference with second-pass discrimination score
$\bar{\delta}(T, R_i)$ is given second-pass distance $d(T, R_i)$ by linearly interpolating
along the line of Fig. 8.

Fig. 7—The time warping plane of a test and reference pattern along with the distance
of each frame and the weighting curve on distance.
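
Steps (i) through (iv) can be sketched as follows. The sketch is illustrative only: the name `second_pass_distances` and the dictionary layout are assumptions, and the linear map is taken to send the smallest average discrimination score to the smallest first-pass distance and the largest to the largest, which is one reading of the interpolation described above.

```python
def second_pass_distances(first_pass, pair_scores):
    """Re-score a confusion list.

    first_pass[a]       : first-pass distance d(T, R_a)
    pair_scores[(a, b)] : discrimination score of R_a computed with the
                          weighting that discriminates R_a from R_b
    Returns second-pass distances that keep the first-pass range but
    follow the ordering of the average discrimination scores.
    """
    names = list(first_pass)
    q = len(names)
    # Step (ii): average discrimination score of each candidate.
    avg = {a: sum(pair_scores[(a, b)] for b in names if b != a) / (q - 1)
           for a in names}
    d_min, d_max = min(first_pass.values()), max(first_pass.values())
    s_min, s_max = min(avg.values()), max(avg.values())
    if s_max == s_min:
        return {a: d_min for a in names}
    # Step (iv): linear interpolation between (s_min, d_min) and (s_max, d_max).
    scale = (d_max - d_min) / (s_max - s_min)
    return {a: d_min + (avg[a] - s_min) * scale for a in names}
```

The candidate with the minimum returned distance is the recognized word; in the test below, the first-pass winner B is overturned in favor of D.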


3.4 Determination of the weighting curves in the speaker-trained case 


The determination of the weighting curves, $W^{i,j}$ and $W^{j,i}$, is readily
performed in the training phase for speaker-trained systems. Given
reference templates $R_i$ and $R_j$, as obtained using the robust training
procedure of Rabiner and Wilpon, a simple way of obtaining $W^{i,j}$ is
to warp $R_j$ to $R_i$, giving

$$W^{i,j}(n) = d\{R_i(n), R_j[w(n)]\}, \qquad (18)$$

where w(n) denotes the warping path. Thus, the frame weights (W)
are essentially the frame-by-frame warped DTW distances between the
reference templates. Figure 9 shows weighting functions for references
corresponding to the words /I/ and /Y/. When compared with the
speaker-independent weights of Rabiner and Wilpon (Reference 11), we
immediately see the statistical effects of small samples. It is evident that the
curves of Fig. 9 need some smoothing to reduce the statistical variance.
The result of applying a 3-point smoother (a triangular window) to
the data of Fig. 9 is given in Fig. 10. A good deal of the statistical
variation in the curves is smoothed out.

Fig. 8—Linear transformation between second-pass distance score and second-pass
discrimination score.
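
A 3-point triangular smoother of the kind mentioned above can be sketched as follows. The window coefficients (1/4, 1/2, 1/4) and the renormalization at the two ends of the curve are assumptions made for the example; the text specifies only a 3-point triangular window. The name `smooth3` is hypothetical.

```python
def smooth3(curve):
    """Smooth a weighting curve with an assumed (1/4, 1/2, 1/4) window,
    renormalizing at the ends where a neighbor is missing."""
    smoothed = []
    last = len(curve) - 1
    for k, value in enumerate(curve):
        total, norm = 0.5 * value, 0.5
        if k > 0:
            total += 0.25 * curve[k - 1]
            norm += 0.25
        if k < last:
            total += 0.25 * curve[k + 1]
            norm += 0.25
        smoothed.append(total / norm)
    return smoothed
```

An isolated spike `[0, 0, 4, 0, 0]` is spread into `[0, 1, 2, 1, 0]`, and a constant curve is left unchanged.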

An alternative, more statistically meaningful, way of obtaining
smoother weighting curves is to use all P replications of each word in
the training set to determine the weights. Basically, we obtain a
weighting function for each pair of training tokens such that each
token is close in distance to the appropriate reference. The final
weighting curve is then obtained by averaging the individual weighting
curves, with appropriate time alignments. We use the term subweights
to denote the set of weights obtained by averaging all training tokens,
and we use the notation S to refer to this set. Figure 11 illustrates the
(sub) weighting curves for I, Y comparisons based on a set of five
training tokens for each word.

Fig. 9—Typical weighting curves for (a) I vs Y, and (b) Y vs I, derived from the
robust training tokens.


IV. EXPERIMENTAL EVALUATION OF THE IMPROVED MODEL 


To measure the performance of the improved, LPC-based, isolated
word-recognition model, a small evaluation test was performed. Each
of four talkers (two male, two female, all experienced with
speech-recognition systems) trained the recognizer on a 39-word alpha-digit
vocabulary by saying each vocabulary word five times during the
course of a single training session. The word-reference patterns, the
normal discrimination weights, W, and subweights, S, were determined
from the training data using the robust training procedure of Rabiner
and Wilpon.

For evaluation purposes the 39-word vocabulary was spoken 10
additional times by each of the four talkers in two distinct recording
sessions. Thus, a total of 390 words were used in each recognition test
for each talker.

Fig. 10—Smoothed weighting curves for (a) I vs Y, and (b) Y vs I, derived from the
robust training tokens and a 3-point smoother.

Fig. 11—Subweight curves for (a) I vs Y, and (b) Y vs I, derived from using all
training tokens.


4.1 Recognition test results 


The overall results of the evaluation tests are given in Table II,
which shows recognition accuracy as a function of stationarity
thresholds, talker, and analysis condition. Three analysis conditions are
shown, namely Pass 1 alone (no discriminant analysis), Pass 2 with
weights, W, derived from single reference tokens, and Pass 2 with
subweights, S, derived from all reference tokens.

Table II—Recognition accuracies (in percent) as a function of the stationarity
thresholds and the number of recognition passes for the four talkers

                                                  Talker Number
Analysis Condition         (THU, THV)       1       2       3       4
Pass 1 Alone               (−∞, ∞)        94.9    94.9    90.5    86.7
                           (−0.3, 0.2)    95.4    94.9    91.8    86.4
                           (0, 0)         94.9    94.9    91.5    85.4
Pass 2 With Weight W       (−∞, ∞)        96.7    95.6    94.1    87.2
                           (−0.3, 0.2)    96.4    95.6    94.9    88.5
                           (0, 0)         95.6    95.4    94.4    86.4
Pass 2 With Subweight S    (−∞, ∞)        95.6    95.9    95.4    87.2
                           (−0.3, 0.2)    95.4    95.4    96.2    87.9
                           (0, 0)         95.4    95.6    95.1    87.2

The results of using Pass 1 alone show only a 0.4-percent
improvement, on average, in recognition accuracy for the four talkers when
comparing the old stationary model (where THU = −∞, THV = ∞)
with the new stationary model (where THU = −0.3, THV = 0.2).

The results of using Pass 2 with weights W show an average of
2.1-percent improvement in recognition accuracy for the four talkers over
the old stationary model (when THU = −0.3 and THV = 0.2). When
subweights S are used in Pass 2, the improvement in recognition
accuracy is an average of 2 percent.

Table II also shows that when Pass 2 is used the recognition accuracy
with stationarity thresholds set to (−0.3, 0.2) is, on average, about 0.5
percent higher than with stationarity thresholds set to (−∞, ∞). This
result indicates that the improved model provides a consistent
recognition accuracy improvement of about 0.5 percent, with or without the
second-pass weights.
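
The average improvements quoted above can be reproduced directly from the entries of Table II. The following short sketch does the arithmetic; the dictionary keys and the helper `mean` are hypothetical names, and the accuracies are the four-talker values from the table.

```python
# Accuracies (percent) for talkers 1-4, taken from Table II.
TABLE_II = {
    ("pass1", "old"): [94.9, 94.9, 90.5, 86.7],   # thresholds (-inf, inf)
    ("pass1", "new"): [95.4, 94.9, 91.8, 86.4],   # thresholds (-0.3, 0.2)
    ("pass2_W", "old"): [96.7, 95.6, 94.1, 87.2],
    ("pass2_W", "new"): [96.4, 95.6, 94.9, 88.5],
    ("pass2_S", "new"): [95.4, 95.4, 96.2, 87.9],
}


def mean(values):
    return sum(values) / len(values)


baseline = mean(TABLE_II[("pass1", "old")])                    # 91.75
gain_nonstationary = mean(TABLE_II[("pass1", "new")]) - baseline
gain_pass2_w = mean(TABLE_II[("pass2_W", "new")]) - baseline
gain_pass2_s = mean(TABLE_II[("pass2_S", "new")]) - baseline
gain_pass2_alone = mean(TABLE_II[("pass2_W", "old")]) - baseline
```

The four gains come out at roughly 0.4, 2.1, 2.0, and 1.6 percent, in agreement with the figures quoted in the text.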


V. DISCUSSION 


The results presented in Section IV are both encouraging and 
discouraging. They are encouraging in that real improvements in 
recognition accuracy were obtained when a nonstationary analysis 
framework was used in place of the purely stationary framework used 
in earlier work. They are discouraging in that the average improvement 
resulting from the nonstationary model (0.5 percent) was considerably 
smaller than the average improvement resulting from the discrimina- 
tion analysis of the second pass (1.6 percent). 

There are several points worth noting that have bearing on the
discussion and results of this paper. The first concerns the anticipated 
improvement in performance for the improved word-recognition 
model. If one carefully considers the sources for recognition errors 
with the alpha-digit vocabulary, it should become clear that the 
anticipated improvement resulting from the nonstationary analysis 
should be small unless some extra weighting is applied to the nonsta- 
tionary regions. This is because words that are strongly affected by 
the nonstationary analysis (e.g., p, d, t, k, etc.) are easily confused with 
similar words in the vocabulary (e.g., b, v, g, a, etc.), and since the 
nonstationary regions are only a small subset of the word patterns, the 
improved analysis will be swamped out by the word-similarity regions. 
This is the original motivation for the discriminant analysis model 
used in the two-pass word recognizer (Reference 11). Hence, the results of Section
IV, which show a small (but consistent) gain for the improved analysis
model and a somewhat larger gain for the discriminant model, are
entirely consistent with the anticipated results given above.

A second point of note is that the implementation of the improved
word model was chosen more for convenience than because it
followed naturally from the theory. Thus, the short-time features were
LPC coefficient sets derived from a short-time window. This implemen- 
tation was straightforward and required only minimal modification of 
the recognizer structure. A more reasonable implementation of the 
short-time analysis in the model would have been something like a 
filter bank model, or a basilar membrane model. Such features would 
then have complemented the long-time LPC features and would have
provided a better vehicle for testing and evaluating the improved
model. The problem with using these alternative short-time feature
sets is that there is no simple way of combining LPC and filter bank (or
basilar membrane model) features and deriving from them a distance 
measure with good physical properties. The problem of combining LPC 
and energy features has already been investigated by Brown and 
Rabiner,” and it was shown that no simple metric existed even for 
such a simple case. The main point in the above discussion is that the 
small gain of the improved word model is more impressive when one 
considers the simplicity of the short-time analysis used to provide the 
performance gain. 

The third point of note is the fact that the simple weighting derived 
from the robust training procedure seemed to provide the same per- 
formance improvement as the more sophisticated weighting obtained 
by using multiple tokens in obtaining the weights. The obvious conclu- 
sion to be drawn from the result is that the gain obtained from the 
second pass (which is due primarily to small regions of extreme spectral 
difference) is manifested in any pair of training tokens and that simple 
smoothing (to eliminate statistical variability) is as good as using 
multiple tokens. 

When one takes into consideration all of the above points, the results 
of Section IV provide a sound basis for believing that the improved 
word-recognition model is a reasonable one and that both the 
nonstationary analysis of the first pass and the discriminant analysis 
of the second pass provide real performance gains. 


VI. SUMMARY 


An improved word-recognition model was proposed in which the 
standard long-time analysis features of the model are combined with 
a set of short-time analysis features. A stationarity index is also 
computed for each speech frame indicating which set of features (long- 
time or short-time) best characterized the current frame of speech. 
Appropriate modifications to the DTW algorithm were required to 
handle the enhanced analysis feature set. Also incorporated in the 
recognition model was a speaker-trained version of the discriminant 
analysis, two-pass model proposed by Rabiner and Wilpon.” 

An evaluation of the model based on an LPC implementation of both 
long-time and short-time feature sets showed that the overall improved 
word model gave a 1- to 5.7-percent improvement in recognition 
accuracy across four experienced users of speech recognition systems 
using an alpha-digit word vocabulary. On average, the nonstationary 
feature set alone led to a 0.5-percent improvement in accuracy, whereas 
the two-pass discriminant analysis alone led to a 1.6-percent average 
improvement in accuracy. The two improvements were almost inde- 
pendent and the overall recognizer had, on average, a 2.1-percent 
improvement in word accuracy. 

The above results are considered encouraging and indicate that the 
improved model should be considered with alternative short-time 
feature sets. 
in a Firestopped Cable Bundle 

A mathematical model of heat transfer in a firestopped vertically 
oriented cable bundle is derived to assist in planning fire test exper- 
iments and to enumerate how changes in geometric and thermophys- 
ical properties affect the temperature rise in the cables when subjected 
to standard furnace fire tests. The analysis indicates that the primary 
heat transfer mode to the cable array is from the flow of hot furnace 
gases up through the void spaces between the individual cables. As 
expected, the most practical and effective way of reducing the heat 
transmission characteristics of a cable bundle is by tightly packing 
the firestop, which reduces the void space between cables and provides 
heat sinking to the cooler environs. 


I. INTRODUCTION 


In this paper a mathematical model of heat transfer in the cable 
bundle of a firestopped vertical cable assembly is developed to assist 
in planning experiments and to evaluate the relative influence of the 
geometrical and thermophysical properties of this portion of the fire- 
stopped configuration. A representation of this complex cable bundle 
geometry in terms of an approximate transient one-dimensional, 
lumped parameter model is obtained through heuristic arguments. 
This is accomplished in a systematic way by first deriving a simplified 
model for a single conductor wire and progressing up in scale, by 
averaging and lumping parameters, to arrive at a heat transfer model 
of a single cable. The heat transfer in a cable bundle is then treated 
using the single-cable model. Cable-to-cable heat transfer is handled 
through boundary conditions at the contact surfaces of the individual 
cables. A set of coupled transient one-dimensional equations results, 
with as many equations as cables in the array. The model is then 
exercised to compute transient temperature distributions within a 
vertically oriented firestopped cable bundle when exposed to the 
ASTM E119* temperature variation shown in Fig. 1. 

Fig. 1—ASTM E119 standard temperature vs time curve. 
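
For numerical experiments, the furnace curve of Fig. 1 can be approximated by interpolating between the control points commonly tabulated for the ASTM E119 exposure. The sketch below is illustrative only. The function name `e119_temperature_c`, the 20 °C starting temperature, and the use of linear interpolation are assumptions and are not part of the model developed in this paper; the control points are rounded Celsius equivalents of the standard's Fahrenheit values.

```python
from bisect import bisect_right

# (time in minutes, furnace temperature in deg C); the first point is an
# assumed ambient start, the rest are rounded from the standard's table.
E119_POINTS = [
    (0.0, 20.0),
    (5.0, 538.0),
    (10.0, 704.0),
    (30.0, 843.0),
    (60.0, 927.0),
    (120.0, 1010.0),
    (240.0, 1093.0),
    (480.0, 1260.0),
]


def e119_temperature_c(minutes):
    """Piecewise-linear approximation of the furnace temperature (deg C)."""
    first_time, first_temp = E119_POINTS[0]
    last_time, last_temp = E119_POINTS[-1]
    if minutes <= first_time:
        return first_temp
    if minutes >= last_time:
        return last_temp
    times = [point[0] for point in E119_POINTS]
    i = bisect_right(times, minutes)          # first control point after `minutes`
    t0, y0 = E119_POINTS[i - 1]
    t1, y1 = E119_POINTS[i]
    return y0 + (y1 - y0) * (minutes - t0) / (t1 - t0)


if __name__ == "__main__":
    for m in (5, 30, 60, 120):
        print(m, "min:", e119_temperature_c(m), "deg C")
```

The two-hour value of this approximation is about 1000 degrees centigrade, in line with the exposure described later for the section of cable below the slab.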

Properties of three generic-type cables (terminating, switchboard, 
and power cable) are used to suggest how physical and geometrical 
characteristics of the cables influence the effectiveness of a firestopped 
cable closure. It should be noted that an absolute evaluation of a 
firestop is beyond the capability of the approximate heat transfer 
model developed here. Such an evaluation can only be attempted by 
considering potential nonlinear combustion modes of the polymeric 
cable materials, which is beyond the scope of this paper. 


II. HEAT CONDUCTION MODEL OF A SINGLE CABLE 


A cable consists of a core containing the insulated wire conductors 
and an outer protective sheath. In general, the cable core will contain 
a loose array of copper wires (which constitute 40 to 50 percent of the 
core) covered with a thin, polymeric insulating layer. Because of the 
high conductivity of copper, the dominant path of heat conduction is in 
the longitudinal direction of the copper wires. Heat is also transferred 
radially, through the porous array of wires, by virtue of thermal 
radiation and heat conduction through contact points along the length 
of the wires, as shown in Fig. 2. The cable core cannot be considered 
a continuum because of the noncontiguous nature of the wires; hence, 
in developing the governing equation of heat conduction it is desirable 
to consider the individual wires as microstructural elements. This 
concept, as will be seen, permits interaction of the wires and leads, in 
the limit, to an approximate continuum representation of heat con- 
duction in the cable core. This development follows. 

* Standard Methods of Fire Tests of Building Construction and Materials, American 
Society for Testing and Materials. 

Fig. 2—Heat transfer modes in firestopped cables. 

Consider the individual conductor wires as composite cylinders with 
copper wire radius $r_c$ (see the appendix for a list of parameters) on 
which is affixed an insulating material with thickness $(r_d - r_c)$, as 
shown in Fig. 3. When radial symmetry prevails, Fourier heat conduction 
equations and boundary conditions for the conductor wire assume 
the following forms: 


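$$k_{cu}\left[\frac{\partial^2 T_{cu}}{\partial z^2} + \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial T_{cu}}{\partial r}\right)\right] = (\rho c)_{cu}\frac{\partial T_{cu}}{\partial t}, \qquad z > 0,\ \ 0 \le r \le r_c,\ \ t > 0 \tag{1}$$

for the copper wire conductors, and 

$$k_{i}\left[\frac{\partial^2 T_{i}}{\partial z^2} + \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial T_{i}}{\partial r}\right)\right] = (\rho c)_{i}\frac{\partial T_{i}}{\partial t}, \qquad z > 0,\ \ r_c \le r \le r_d,\ \ t > 0 \tag{2}$$

for the insulation, with boundary conditions: 

$$T_{cu}(0, z, t) = \text{finite}, \tag{3}$$

$$T_{cu}(r_c, z, t) = T_i(r_c, z, t),{}^{*} \tag{4}$$

$$k_{cu}\frac{\partial T_{cu}}{\partial r}(r_c, z, t) = k_i\frac{\partial T_i}{\partial r}(r_c, z, t), \tag{5}$$

$$k_i\frac{\partial T_i}{\partial r}(r_d, z, t) = -h_i\left[T_i(r_d, z, t) - T(z, t)\right], \tag{6}$$

and initial conditions: 

$$T_{cu}(r, z, 0) = T_i(r, z, 0) = 0. \tag{7}$$

Fig. 3—(a) Macrocoordinate and (b) microcoordinate systems. 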


* A perfect contact between the copper wire and insulation is assumed, since the 
insulation is literally molded onto the wire. 
In the above equations, $T(r, z, t)$ is the temperature, $k$ is the thermal 
conductivity, $\rho c$ is the volumetric sensible heat capacity, $z$ is the axial 
coordinate along the length of wire, $r$ is the radial coordinate, $t$ is the 
time, $h_i$ is an effective linear heat transfer coefficient, and $T(z, t)$ is a 
temperature to be assigned subsequently. The subscripts "cu" and "i" 
are assigned to quantities associated with the copper conductor and 
insulation, respectively. Since the diameter of the wire is much smaller 
than its length, it is convenient to express the temperature in the series 


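$$T_{cu}(r, z, t) = T_0(z, t) + T_{cu}^{(1)}(r, z, t) + \cdots, \tag{8}$$

and 

$$T_{i}(r, z, t) = T_0(z, t) + T_{i}^{(1)}(r, z, t) + \cdots. \tag{9}$$

The leading term in eqs. (8) and (9) is the average temperature, i.e., 

$$T_0(z, t) = \frac{2}{r_d^2}\left[\int_0^{r_c} r\,T_{cu}(r, z, t)\,dr + \int_{r_c}^{r_d} r\,T_i(r, z, t)\,dr\right], \tag{10}$$

from which it follows that 

$$\int_0^{r_c} r\,T_{cu}^{(n)}(r, z, t)\,dr + \int_{r_c}^{r_d} r\,T_i^{(n)}(r, z, t)\,dr = 0, \qquad n = 1, 2, 3, \cdots. \tag{11}$$

Substituting series (8) and (9) into differential equations (1) and (2) 
produces a set of recurrent differential equations (with $T_{cu}^{(0)} = T_i^{(0)} = T_0$): 

$$k_{cu}\left[\frac{\partial^2 T_{cu}^{(n-1)}}{\partial z^2} + \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial T_{cu}^{(n)}}{\partial r}\right)\right] = (\rho c)_{cu}\frac{\partial T_{cu}^{(n-1)}}{\partial t}, \qquad n = 1, 2, \cdots \tag{12}$$

$$k_{i}\left[\frac{\partial^2 T_{i}^{(n-1)}}{\partial z^2} + \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial T_{i}^{(n)}}{\partial r}\right)\right] = (\rho c)_{i}\frac{\partial T_{i}^{(n-1)}}{\partial t}, \qquad n = 1, 2, \cdots. \tag{13}$$

All $T_{cu}^{(n)}(r, z, t)$ and $T_i^{(n)}(r, z, t)$, $n = 1, 2, 3, \cdots$, are then expressible in 
terms of the average temperature, $T_0(z, t)$. Retaining terms up to first 
order, it follows from (12) and (13) that 

$$T_{cu}(r, z, t) = T_0(z, t) - \frac{r^2}{4k_{cu}}L^{cu} + A(z, t)\ln r + B(z, t), \tag{14}$$

and 

$$T_{i}(r, z, t) = T_0(z, t) - \frac{r^2}{4k_{i}}L^{i} + C(z, t)\ln r + D(z, t), \tag{15}$$

where 

$$L^{cu} = k_{cu}\frac{\partial^2 T_0}{\partial z^2} - (\rho c)_{cu}\frac{\partial T_0}{\partial t}, \qquad L^{i} = k_{i}\frac{\partial^2 T_0}{\partial z^2} - (\rho c)_{i}\frac{\partial T_0}{\partial t}.$$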





The variables of integration A, B, C, D are determined from boundary 
conditions, eqs. (3) through (5) and eq. (11). The final form of the 
temperature in the copper wire and insulation in terms of the average 
temperature $T_0(z, t)$ is 


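$$T_{cu}(r, z, t) = T_0(z, t) + \frac{1}{4k_{cu}}\left(r_c^2 - \frac{r_c^4}{2r_d^2} - r^2\right)L^{cu} + \frac{(r_d^2 - r_c^2)^2}{8r_d^2 k_i}L^{i} + \frac{r_c^2}{2k_i}\left[\ln(r_c/r_d) + \frac{r_d^2 - r_c^2}{2r_d^2}\right]\left(L^{i} - L^{cu}\right) \tag{16}$$

and 

$$T_{i}(r, z, t) = T_0(z, t) + \frac{1}{4k_{i}}\left(\frac{r_d^4 + r_c^4}{2r_d^2} - r^2\right)L^{i} - \frac{r_c^4}{8r_d^2 k_{cu}}L^{cu} + \frac{r_c^2}{2k_i}\left[\ln(r/r_d) + \frac{r_d^2 - r_c^2}{2r_d^2}\right]\left(L^{i} - L^{cu}\right). \tag{17}$$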
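
The first-order profiles can be checked numerically against the conditions that fix the variables of integration. The sketch below is only a consistency check under stated assumptions: the function names and the sample values are hypothetical, and the profile expressions coded here are the ones obtained by solving eqs. (14) and (15) subject to conditions (3), (4), (5), and (11). It confirms temperature continuity at the copper surface, flux continuity there, and the zero-mean property of the first-order terms.

```python
import math


def first_order_profiles(r_c, r_d, k_cu, k_i, L_cu, L_i):
    """Return (T_cu - T_0, T_i - T_0) as functions of the radius r."""
    C = r_c**2 * (L_i - L_cu) / (2.0 * k_i)       # coefficient of ln r in the insulation
    shift = (r_d**2 - r_c**2) / (2.0 * r_d**2)

    def dT_cu(r):
        return ((r_c**2 - r_c**4 / (2.0 * r_d**2) - r**2) * L_cu / (4.0 * k_cu)
                + (r_d**2 - r_c**2) ** 2 * L_i / (8.0 * r_d**2 * k_i)
                + C * (math.log(r_c / r_d) + shift))

    def dT_i(r):
        return (((r_d**4 + r_c**4) / (2.0 * r_d**2) - r**2) * L_i / (4.0 * k_i)
                - r_c**4 * L_cu / (8.0 * r_d**2 * k_cu)
                + C * (math.log(r / r_d) + shift))

    return dT_cu, dT_i


def simpson(f, a, b, panels=2000):
    """Composite Simpson rule with an even number of panels."""
    h = (b - a) / panels
    total = f(a) + f(b)
    for j in range(1, panels):
        total += (4.0 if j % 2 else 2.0) * f(a + j * h)
    return total * h / 3.0


def check(r_c=1.0, r_d=1.6, k_cu=400.0, k_i=0.2, L_cu=3.0, L_i=-1.5):
    """Return residuals of conditions (4), (5), and (11) for sample values."""
    dT_cu, dT_i = first_order_profiles(r_c, r_d, k_cu, k_i, L_cu, L_i)
    step = 1e-6
    flux_cu = k_cu * (dT_cu(r_c + step) - dT_cu(r_c - step)) / (2.0 * step)
    flux_i = k_i * (dT_i(r_c + step) - dT_i(r_c - step)) / (2.0 * step)
    mean = (simpson(lambda r: r * dT_cu(r), 0.0, r_c)
            + simpson(lambda r: r * dT_i(r), r_c, r_d))
    return dT_cu(r_c) - dT_i(r_c), flux_cu - flux_i, mean


if __name__ == "__main__":
    print(check())
```

All three residuals should be zero to within rounding and finite-difference error for any admissible choice of the sample values.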
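It now remains to obtain the differential equation for the average 
temperature, $T_0(z, t)$. This equation comes from the satisfaction of 
the boundary condition at $r = r_d$, as shown in eq. (6): 

$$\frac{1}{2}\left(1 - \frac{r_c^2}{r_d^2}\right)\left[1 + \frac{h_i(r_d^2 - r_c^2)}{4r_d k_i}\right]L^{i} + \left\{\frac{r_c^2}{2r_d^2} + \frac{h_i}{8r_d^3}\left[\frac{2r_c^2(r_d^2 - r_c^2)}{k_i} + \frac{r_c^4}{k_{cu}}\right]\right\}L^{cu} = \frac{h_i}{r_d}\left[T_0(z, t) - T(z, t)\right]. \tag{18}$$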


Relating the variables in each conductor wire to those of neighboring 
ones in the limit leads to a continuum base from a discrete one. To 
arrive at this continuum description of the cable core consider the 
wires as embedded in a radially symmetric macrocoordinate system, 
as shown in Fig. 3. The center of, say, the nth conductor is located at 
$R_n$, and eq. (18) is written at this point. In keeping with the assumption 
of radial symmetry, the temperature on the right of (18), $T(z, t)$, is 
related to the average temperatures of the adjacent wires, $T_0^{(n+1)}$ and 
$T_0^{(n-1)}$, located at $R_n + 2r_d$ and $R_n - 2r_d$, respectively. To determine 
$T(z, t)$ the temperatures are weighted with respect to the location of 
the wire in the cable core. This has the effect of enforcing radial symmetry: 

$$T(z, t) = \frac{1}{2R_n}\left[(R_n + r_d)\,T_0^{(n+1)} + (R_n - r_d)\,T_0^{(n-1)}\right]. \tag{19}$$

Thus, if D is used to denote the difference appearing on the right of 
(18), it can be written with the aid of (19) as 

$$D = -\frac{1}{2R_n}\left[(R_n + r_d)\,T_0^{(n+1)} - 2R_n T_0^{(n)} + (R_n - r_d)\,T_0^{(n-1)}\right]. \tag{20}$$


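When we expand $T_0^{(n+1)}$ and $T_0^{(n-1)}$ in a Taylor series about $R_n$, eq. 
(20) becomes 

$$D = -\frac{(2r_d)^2}{2}\left[\frac{1}{R}\frac{\partial}{\partial R}\left(R\frac{\partial T_0}{\partial R}\right)\right]_{R = R_n} + O(r_d^4). \tag{21}$$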
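
The passage from the discrete difference of eq. (20) to its continuum limit can be illustrated numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the test temperature fields are arbitrary smooth functions of the macrocoordinate $R$. It compares the weighted difference between a wire at $R_n$ and its neighbors at $R_n \pm 2r_d$ with $-2r_d^2$ times the radial Laplacian evaluated at $R_n$.

```python
import math


def weighted_difference(T, R_n, r_d):
    """Discrete difference D built from neighbors at R_n + 2 r_d and R_n - 2 r_d."""
    upper = (R_n + r_d) * T(R_n + 2.0 * r_d)
    lower = (R_n - r_d) * T(R_n - 2.0 * r_d)
    return -(upper - 2.0 * R_n * T(R_n) + lower) / (2.0 * R_n)


def continuum_limit(dT, d2T, R_n, r_d):
    """Leading term: -(2 r_d)^2 / 2 times (1/R) d/dR (R dT/dR) at R_n."""
    radial_laplacian = d2T(R_n) + dT(R_n) / R_n
    return -2.0 * r_d**2 * radial_laplacian


if __name__ == "__main__":
    R_n, r_d = 1.0, 0.01
    exact = weighted_difference(math.cos, R_n, r_d)
    limit = continuum_limit(lambda R: -math.sin(R), lambda R: -math.cos(R), R_n, r_d)
    print(exact, limit, exact - limit)
```

For a quadratic temperature field the two expressions agree exactly; for a general smooth field they differ by a term of order $r_d^4$, which is why only the leading term is retained when the wire radius is small.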








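If we substitute (21) into (18) and consider $r_d \ll 1$ so that only the 
first term on the right of (21) is significant, the final form of the 
continuum representation of the cable core, after the subscript n is 
eliminated, is obtained as: 

$$\left(1 - \frac{r_c^2}{r_d^2}\right)\left[1 + \frac{h_i(r_d^2 - r_c^2)}{4r_d k_i}\right]L^{i} + \left\{\frac{r_c^2}{r_d^2} + \frac{h_i}{4r_d^3}\left[\frac{2r_c^2(r_d^2 - r_c^2)}{k_i} + \frac{r_c^4}{k_{cu}}\right]\right\}L^{cu} + (4h_i r_d)\,\frac{1}{R}\frac{\partial}{\partial R}\left(R\frac{\partial T_0}{\partial R}\right) = 0. \tag{22}$$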


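The quantity $(4h_i r_d)$ has the dimensions of conductivity and represents 
the effective radial conductivity of the cable core. The conductivity in 
the axial direction reflects the effect of the microstructure and is given 
as 

$$(k_z)_{eff} = \left(1 - \frac{r_c^2}{r_d^2}\right)\left[1 + \frac{h_i(r_d^2 - r_c^2)}{4r_d k_i}\right]k_i + \left\{\frac{r_c^2}{r_d^2} + \frac{h_i}{4r_d^3}\left[\frac{2r_c^2(r_d^2 - r_c^2)}{k_i} + \frac{r_c^4}{k_{cu}}\right]\right\}k_{cu}, \tag{23a}$$

and the effective heat capacity as 

$$(\rho c)_{eff} = \left(1 - \frac{r_c^2}{r_d^2}\right)\left[1 + \frac{h_i(r_d^2 - r_c^2)}{4r_d k_i}\right](\rho c)_i + \left\{\frac{r_c^2}{r_d^2} + \frac{h_i}{4r_d^3}\left[\frac{2r_c^2(r_d^2 - r_c^2)}{k_i} + \frac{r_c^4}{k_{cu}}\right]\right\}(\rho c)_{cu}. \tag{23b}$$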


Notice that when the conductor wires are assumed insulated, i.e., 
$h_i = 0$, there is no radial heat flow and the effective heat capacity and 
axial conductivity are given, as expected, by the law of mixtures. 
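
The law-of-mixtures limit can be made concrete with a short calculation. The sketch below is a hedged illustration: the function name and the sample property values are hypothetical, and the first-order weights in $h_i$ are those obtained by carrying the radial corrections through boundary condition (6), so they should be read as an assumption to be checked against eqs. (23a) and (23b). With $h_i = 0$ the weights reduce to the area fractions of copper and insulation.

```python
def effective_core_properties(r_c, r_d, k_cu, k_i, rc_cu, rc_i, h_i):
    """Effective axial conductivity, heat capacity, and radial conductivity.

    r_c, r_d     : copper radius and total (insulated) wire radius
    k_cu, k_i    : conductivities of copper and insulation
    rc_cu, rc_i  : volumetric heat capacities of copper and insulation
    h_i          : wire-to-wire conductance (0 means thermally isolated wires)
    """
    area_cu = (r_c / r_d) ** 2            # copper fraction of the wire cross section
    area_i = 1.0 - area_cu                # insulation fraction
    gap = r_d**2 - r_c**2
    w_i = area_i * (1.0 + h_i * gap / (4.0 * r_d * k_i))
    w_cu = area_cu * (1.0 + h_i * gap / (2.0 * r_d * k_i)
                      + h_i * r_c**2 / (4.0 * r_d * k_cu))
    return {
        "k_axial": w_i * k_i + w_cu * k_cu,
        "heat_capacity": w_i * rc_i + w_cu * rc_cu,
        "k_radial": 4.0 * h_i * r_d,
    }


if __name__ == "__main__":
    isolated = effective_core_properties(1.0, 2.0, 4.0, 0.2, 3.4, 1.7, h_i=0.0)
    coupled = effective_core_properties(1.0, 2.0, 4.0, 0.2, 3.4, 1.7, h_i=0.05)
    print(isolated)
    print(coupled)
```

In the isolated case the radial conductivity vanishes and the axial values are the area-weighted averages; any positive $h_i$ switches on the radial path through $4h_i r_d$.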
Now that the heat conduction equation for the cable core has been 
determined, it remains to obtain the equation that governs the heat 
transfer in the entire cable including the outer sheath. This develop- 
ment parallels the procedure employed above. The only exception is 
that, since the cable core and sheath are not generally in perfect 
contact, the average temperatures in the cable core and sheath are 
assumed to be different. In many cables the cable core can be pushed 
through the cable sheath with only a moderate pressure. Since the 
axial conductivity of the cable core is much larger than that of the 
cable sheath, as a simplification, it appears reasonable to assume that 
the axial heat conduction in the cable sheath can be neglected. Thus, 
the expression for the temperature in the cable core and sheath 
assumes the form 


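$$T_c(R, z, t) = T_c^{(0)}(z, t) + \frac{(2R^2 - R_c^2)}{8k_R}\,L(z, t), \qquad 0 \le R \le R_c \tag{24a}$$

and 

$$T_s(R, z, t) = T_s^{(0)}(z, t) + \frac{1}{4\kappa_s}\left[R^2 - \frac{R_s^2 + R_c^2}{2}\right]\frac{\partial T_s^{(0)}}{\partial t} + \frac{R_c^2}{2}\left[\ln(R/R_c) - \frac{R_s^2}{R_s^2 - R_c^2}\ln(R_s/R_c) + \frac{1}{2}\right]\left[\frac{1}{k_s}L(z, t) - \frac{1}{\kappa_s}\frac{\partial T_s^{(0)}}{\partial t}\right], \qquad R_c \le R \le R_s, \tag{24b}$$

where 

$$L(z, t) = (\rho c)_{eff}\frac{\partial T_c^{(0)}}{\partial t} - (k_z)_{eff}\frac{\partial^2 T_c^{(0)}}{\partial z^2},$$

and $T_c^{(0)}(z, t)$, $T_s^{(0)}(z, t)$ are the average cable core and cable sheath 
temperatures, respectively. The other parameters are defined in the 
appendix. The average cable core and cable sheath temperatures given 
above are related through the imperfect heat transfer boundary con- 
dition at $R = R_c$, namely, 

$$k_R\frac{\partial T_c}{\partial R} = -h_{cs}\left(T_c - T_s\right)\Big|_{R = R_c}. \tag{25}$$

It follows that 

$$\left\{\frac{R_c}{2} + \frac{h_{cs}R_c^2}{8k_R} + \frac{h_{cs}R_c^2}{2k_s}\left[\frac{R_s^2}{R_s^2 - R_c^2}\ln(R_s/R_c) - \frac{1}{2}\right]\right\}L(z, t) + h_{cs}T_c^{(0)}(z, t) = \left\{-\frac{h_{cs}(R_s^2 - R_c^2)}{8\kappa_s} + \frac{h_{cs}R_c^2}{2\kappa_s}\left[\frac{R_s^2}{R_s^2 - R_c^2}\ln(R_s/R_c) - \frac{1}{2}\right]\right\}\frac{\partial T_s^{(0)}}{\partial t}(z, t) + h_{cs}T_s^{(0)}(z, t), \tag{26}$$

where $h_{cs}$ is the heat transfer coefficient between the cable core and 
cable sheath. 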
In the following section eqs. (24b) and (26) are used to determine 
the heat transfer in an array of sheathed cables that form a bundle. 
However, before this is accomplished it is worthwhile to briefly sum- 
marize the preceding developments. 

The cable core is obviously not a continuum; consequently, a micro- 
structural characterization starting with an individual conductor wire 
was used to derive the governing radially symmetric heat conduction 
equation for the cable core. This derivation produces an effective or 
averaged macroscopic continuum representation of the cable core that 
retains the physical characteristics of the conductor wire microstruc- 
ture. It is, however, anisotropic, since the thermal conductivity in the 
axial and radial directions is not equal. After the governing 
equation for heat flow in the core was obtained, the equations that the 
cable sheath temperature satisfies [eqs. (24b) and (26)] were determined. 
These were obtained, paralleling the preceding microstructural deri- 
vation, for heat flow predominantly in the axial direction. The radial 
heat transfer is introduced as a perturbation. 


III. HEAT TRANSFER THROUGH A CABLE BUNDLE 


The objective of this investigation is to analyze heat flow through 
firestopped cable bundles during fire tests. The ASTM E119 temper- 
ature-time history, Fig. 1, provides the fire environment. A typical 
firestopped cable bundle configuration and the above- and below-floor 
coordinate system is illustrated in Fig. 4. 

As previously mentioned, cable penetrations vary in size and in the 
number and type of cables accommodated; to lend a degree of defi- 
niteness to the analysis it is convenient to consider a widely used 
arrangement. A square cable array containing nine cables, as illustrated 
in Fig. 5, is a mathematically manageable configuration, yet most of 
the heat transfer characteristics of larger cable arrays are maintained. 
From the point of view of symmetry, only three cables need be 
considered—the corner cable, to be designated hereafter as cable 1, 
the side cable, to be denoted as cable 2, and the center cable, to be 
called cable 3. 

The section of each cable below the slab (see Fig. 4) is directly 
exposed in a furnace to a fire temperature of up to 1000 degrees 
centigrade and, therefore, after only a short time all polymeric insu- 
lating material is burned away. In the model this effect is approximated 
by assuming that only a loose array of independent copper wires 
projects below the slab. The section of the cable bundle extending up 
through and above the floor slab experiences a different thermal 
environment. A zone of decomposition of the polymeric insulating 
materials occurs and creeps upwards during the extended period of 
exposure to the below-floor fire. This zone of charred and expanded 
insulating material alters the temperature in the void spaces. In the 
present analysis the effect of this nonlinear phenomenon is ignored to 
render the analysis tractable. It is conservatively assumed here that 
the hot furnace gases move unimpeded in the void space between 
cables. 

Fig. 4—Typical cable opening firestop test configuration. 

The average cable core and average cable sheath temperature for 
each of three cables is determined by performing a heat balance at the 
outer cable sheath surface $R = R_s$. This leads to the following three 
coupled equations: 


aT 
oR 





ke (Rs, 2, t) = 2A-Hea[T? — TY] + (8A¢+ 2A0H [TS — TH) 
Fig. 5—Cable bundle configuration considered. 


+ AAT; — TY] + 2A/H,Fil TP -— TY] 


+ AjH,F2[T? — TY ]|r=r, (27a) 


T? 
ks = (Rs, 2, t) = 2A-Hesl T? — T?] + (A-Hen + 2A,H,Fw) 


[TS — TE] + 2AM [TP - TP] + (24;+ ADH ATS — T?" 


+ 2A/H,-Fi[T® — T? rar, (27b) 


To 
a (Rs, 2, t) = 4A Heal TY — TY) + 4A,A{T? - T?) 


+ 8A/H,Fx[T® — T®] + 4AH-FofT® — T® | ror. (27) 


In the above equations the parameter $A_c$ denotes the fraction of the 
cable sheath perimeter that is in contact with an adjacent cable, and 
$A_f$ the portion exposed to the hot furnace gases that flow up through 
the interstices of the cable. It therefore follows that $4(A_c + A_f) = 1$. 
Hz is the solid-contact conductance between the cables, H; is the heat 
transfer coefficient between the cables and the firestop material, Hi, is 
the convective heat transfer coefficient between the cable and the hot 
furnace gases that flow up through the void spaces between cables, 
and $H_r$ is a linearized black body radiative heat transfer coefficient 
between the cables. $F_{12}$ and $F_{22}$ are radiation form factors between 
cable 3 and cable 2, and cable 1 and cable 3, respectively. The 
temperatures T. (z, t) and Tv, (z, t) are assumed known and represent 
the temperature distribution in the mineral wool insulating firestop 
material surrounding the cable bundle and the temperature of the 
furnace gases that move up through the void spaces between cables, 
respectively. 

At this juncture, it is convenient to eliminate the independent time 
variable $t$ from the equations by introducing the Laplace transforma- 
tion 

$$\bar{T}(z; p) = \int_0^{\infty} T(z, t)\,e^{-pt}\,dt. \tag{28}$$


In what follows, a bar over a variable indicates that the transformation 
(28) has been performed. Substituting eq. (24b) into (27) and using eq. 
(26) to eliminate the average cable sheath temperature, we produce a 
system of three coupled ordinary differential equations for the trans- 


formed average cable core temperatures T'§} , T7$3, and T°}: 


(T,D? — A) TE} — 2(yiD? — 6) TS — (mD? - 4) TY 


= —(Ies + pEy)[ (8A; + 2A)H TP? + AAT”), (29a) 
—2(nD? —0:)T $i + (T2D? — As) T83 — (nD? — &)TY 

= —(hos + pEy)[(2A¢+ AJH/T,? + 2A dT ,”), (29b) 
—4(mD? e)TS} aa (ysD? — 63) TS3 + (['3D° = As) T'S 

= —A(h.s + pE;)ApMiT,’, (29c) 


d 2 
where the operator D? = Fe and the coefficients I, A, y, 6, n, and € are 
Fi 


linear functions of the transform parameter p. After considerable 
algebra the system of differential equations (29) is solved for the three 
average cable core temperatures in the form 


TS = Ciexp(—zVm) + Coexp(—zVme2) + Csexp(—zVmsz) 

—(hes + pE:)[RT;/ + HT;”}. (80) 
The quantities $m_1$, $m_2$, and $m_3$ are functions of the Laplace transform 
parameter $p$ and are evaluated as the roots of the characteristic cubic 
equation of the system of equations in (29). The coefficients of 
integration $C_1$, $C_2$, and $C_3$ are determined by matching the solution of the 
cable section below the slab. This is accomplished by enforcing 
temperature and energy continuity for each of the three cables at the 
common boundary Z = 0 and X = L, as shown in Fig. 4. The cable 
sheath temperature evaluated at $R = R_s$ follows from (30) in the form 


hes + pB, 


— (1 + pAi) + pD |TY 
aos pA;) + p | 0 


TARs, ze P) = 





2F7AK0) 
Bi(1 + pAi) " Dr act. (31) 


= (hal (hes + pEy) dz? ’ 


where $A_1$, $B_1$, $D_1$, and $E_1$ are constants. 

The temperatures presented in this section are in the Laplace 
transform domain and must be inverted to the real-time regime. The 
functions, however, are too complicated to be inverted in closed form. 
A numerical procedure using the method of quadratures to obtain 
these inversions is discussed in Section IV. 


IV. INVERSION OF LAPLACE TRANSFORM TEMPERATURE SOLUTION 


The form of the transformed temperature solution given in eqs. (30) 
and (31) is much too complicated to use for obtaining a closed-form 
inversion formula. Consequently, we must resort to a numerical inver- 
sion procedure. Most of the methods that appear in the literature 
involve expanding the transformed function in a series that could then 
be inverted term by term using tabulated formulae. Littlewood and 
Zakian’ suggest expanding in a series of Chebyshev polynomials, while 
Longman’ proposes using the Padé table for the Taylor series expan- 
sion of the transformed function. Both of these methods were judged 
to be impractical because of the complexity of the functions to be 
inverted here. 

The method that was finally adopted was developed by Talbot.* The 
inversion of arbitrary transforms is accomplished by a method of 
quadrature along a special contour in the complex plane. The standard 
inversion formula for a transformed function $F(p)$ involves performing 
the following integration in the complex plane: 

$$f(t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} F(p)\,e^{pt}\,dp, \tag{32}$$

where $p$ is considered a complex variable, $i = (-1)^{1/2}$, and $\gamma$ is to the 
right of all the singularities of $F(p)$. In Talbot’s method the path of 
integration indicated by (32) is deformed to the path L shown in Fig. 
6, the equation of which is 

$$p = \alpha + i\theta, \tag{33}$$

where $\alpha = \theta \cot\theta$. 

Fig. 6—Path L for Laplace transform inversion. 

Path L is equivalent to the standard integration path providing that 

(i) L encloses all singularities of $F(p)$, and 

(ii) $|F(p)| \to 0$ uniformly in $\operatorname{Re} p \le 0$ as $|p| \to \infty$. 

Condition (ii) holds for the functions considered here. Condition (i), 
in general, may not be satisfied by a given $F(p)$; however, the modified 
function $F(\lambda p + \sigma)$ for suitable scaling and shift factors $\lambda$ and $\sigma$ can be 
made to conform. In this regard, for the functions to be inverted here 
it can be argued that the singularities exhibited by eqs. (30) and (31) 
are located on the negative real axis; consequently, condition (i) is 
satisfied without resorting to scaling or shift factors. However, to 
accurately perform the inversions over large intervals of time (the 
temperatures are calculated for times up to two hours) a scaling factor 
$\lambda > 1$ is necessary. This scaling factor merely shifts the singularity 
along the negative real axis closer to the origin. 

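Once conditions (i) and (ii) are satisfied, the inversion formula (32), 
when taken about L, assumes the form 

$$f(t) = \frac{\lambda e^{\sigma t}}{2\pi i}\int_{-\pi}^{\pi} e^{\lambda p t}\,F(\lambda p + \sigma)\,\frac{dp}{d\theta}\,d\theta. \tag{34}$$

Finally, Simpson quadrature with equal intervals $\pi/n$ in the 
variable $\theta$ gives the approximation 

$$f(t) \approx \frac{\lambda e^{\sigma t}}{3n}\operatorname{Re}\left\{H_0 + 4H_1 + 2H_2 + 4H_3 + \cdots + 4H_{n-1}\right\}, \tag{35}$$

where 

$$H_k = H(p)\big|_{\theta_k}, \qquad \theta_k = k\pi/n \quad (k = 0, \cdots, n - 1),$$

$$H(p) = e^{\lambda p t}\,F(\lambda p + \sigma)\,(1 + i\beta),$$

and 

$$\beta = \left[\theta - \tfrac{1}{2}\sin(2\theta)\right]/\sin^2\theta.$$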


The symbol Re indicates that only the real part of the complex 
quantity is taken. Sufficient accuracy is obtained by a suitable choice 
of $n$, $\lambda$, and $\sigma$. The principles of choice are presented by Talbot? and 
will not be discussed here. 

A check of this technique was made by inverting the function given 
in Ref. 2. In all instances satisfactory results were obtained. For 
numerical inversions performed here, it was found sufficient to take 
$n = 20$, $\lambda = 8$, and $\sigma = 0$. 
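
The contour quadrature of this section is compact enough to sketch in a few lines. The following Python sketch is an illustration under stated assumptions, not the program used for the results reported here: it applies the trapezoidal rule on the contour of eq. (33) instead of Simpson weights, takes $\sigma = 0$, and (as an assumption borrowed from common practice for this contour) scales the path with $\lambda = 2n/(5t)$ rather than a fixed $\lambda$. The function name `talbot_invert` is hypothetical.

```python
import cmath
import math


def talbot_invert(F, t, n=20):
    """Approximate f(t) from its Laplace transform F(p), for t > 0.

    Contour: p = lam * (theta * cot(theta) + i * theta), 0 <= theta < pi,
    using the symmetry of the integrand about the real axis.
    """
    lam = 2.0 * n / (5.0 * t)                      # assumed scaling rule
    # theta = 0: p = lam, beta = 0; trapezoidal end point gets half weight
    total = 0.5 * (cmath.exp(lam * t) * F(complex(lam, 0.0))).real
    for k in range(1, n):
        theta = k * math.pi / n
        cot = math.cos(theta) / math.sin(theta)
        p = complex(theta * cot, theta)            # unscaled contour point
        beta = (theta - 0.5 * math.sin(2.0 * theta)) / math.sin(theta) ** 2
        H = cmath.exp(lam * t * p) * F(lam * p) * complex(1.0, beta)
        total += H.real
    return lam / n * total


if __name__ == "__main__":
    # Transform of exp(-t), whose only singularity is on the negative real axis
    print(talbot_invert(lambda p: 1.0 / (p + 1.0), 1.0), math.exp(-1.0))
    # A conduction-type transform: exp(-sqrt(p)) / p  <->  erfc(1 / (2 sqrt(t)))
    print(talbot_invert(lambda p: cmath.exp(-cmath.sqrt(p)) / p, 1.0), math.erfc(0.5))
```

With this scaling rule, $n = 20$ at $t = 1$ gives $\lambda = 8$. Both test transforms have their singularities on the negative real axis, which is the situation argued above for eqs. (30) and (31).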


V. DISCUSSION AND SUMMARY 


The concept of firestopping any type of penetration is synonymous 
with retarding the flow of heat from the fire side of the penetration to 
the unexposed side. The problem rests solely on the identification of 
procedures and materials to seal the space adjacent to the penetrant 
to meet certain standards. Cables by their very nature have excellent 
thermal conduction properties in the longitudinal direction. Very little 
can be done to prevent heat conduction up the array of copper wire 
conductors that make up the center core section of the cable. However, 
it appears reasonable to measure the efficiency of a firestop by the 
temperature rise on the free surfaces on the unexposed side of the 
penetration. If these surfaces are maintained at sufficiently low tem- 
peratures, ignition of combustibles that happen to be in direct contact 
or in close proximity cannot occur. In the case of cable penetrations 
the critical surfaces are the horizontal firestopping material surface on 
the unexposed side and the vertical cable sheath surfaces at the outer 
perimeter of the cable bundle, as shown in Fig. 4. In normal practice 
the firestop material is chosen to be a good thermal insulator and is 
applied with sufficient depth to preclude high temperatures at the top 
surface. Thus, the efficiency of the firestopped geometry will, in 
general, be determined more by the temperature rise on the cable 
sheath surface of the outer perimeter cables than by the firestop 
material. To ascertain the temperature at this critical location, a 
theoretical transient heat transfer model of a firestopped cable bundle 
was developed. In addition to heat flow through the cable core, the 
model treats the lateral heat flow from the cable core to the cable 
sheath and convective heat transfer to the cable sheath from the flow 
of combustion gases through the interstices between the cables. The 
standard ASTM E119 temperature variation is applied at one end of 
the cable bundle and the temperature distribution of the individual 
cables is computed along the length for up to two hours. To quantify 
these effects, temperature distribution was computed at 10-minute 
intervals for two hours at the 20-cm and 30-cm firestopped depth for 
three different cable types—switchboard, terminating, and power ca- 
ble. For each of these cable types the following parameters were 
varied: 
(i) $R_s$, cable radius 

(ii) $A_c$, fraction of cable sheath surface in contact with an adjacent 
cable 

(iii) $h_{cs}$, conductance between cable core and sheath 

(iv) $H_f$, conductance between cable bundle perimeter and firestop 
material 

(v) $\Delta p$, furnace pressure. 

Some remarks concerning the inclusion of the furnace pressure as a 
parameter are in order. Furnace pressure influences the magnitude of 
the heat transmitted to the cable sheath—by the hot furnace gases 
that travel up through the space between cables—through the heat 
transfer coefficient H,, as shown in eq. (27). The value of Hi; is 
computed from standard empirical formulae‘ once the gas flow velocity 
and flow-channel characteristics are known. The steady-state gas 
velocity is calculated from the furnace pressure assuming that the 
spaces between cables are independent flow channels.” In general, Hi 
increases with increased furnace gas pressure. 

Some of the results of the analysis for a given type cable are given 
in Table I and Fig. 7. The calculated end-point temperature at z = 30 
cm in Fig. 4 and t = 2 hours for the cable sheath and core are given in 
Table I, and a typical temporal temperature distribution is given in 
Fig. 7. The temperatures given in Table I and Fig. 7 should not be 
construed to be indicative of actual measured test values. As previously 
indicated, the model does not take into account nonlinear aspects of 
this obviously complex phenomenon, such as potential combustion 
modes and melting of the polymeric materials that in some instances 
could conceivably, for periods of time, constrict the void spaces 
between cables and thereby reduce the flow of furnace gases. Never- 
theless, the sensitivity of the cable temperatures to changes in the 
linear cable bundle parameters identified in the model, $H_f$, $A_c$, $R_s$, and 
$\Delta p$, can be calculated and is presented in Table I. 
Table I—Calculated temperatures at z = 30 cm and t = 2 hours for 
various values of the cable bundle parameters 

                                                Cable Sheath         Cable Core 
                                                Temperature          Temperature 
                                                Rise (°C)            Rise (°C) 
       H_f                 R_s     Δp (… of 
Case   (W/cm²-°C)   A_c    (cm)    Water)     Cable 1   Cable 3   Cable 1   Cable 3 

1      0.034        0.1    1.27    1          115       164       199       246 
2      0.017        0.1    1.27    1          224       265       293       332 
3      0.017        0.15   1.27    1          170       190       248       270 
4      0.017        0.1    0.635   1           93       115       135       154 
5      0.017        0.1    1.27    0.13       102       115       188       204 


Fig. 7—Cable core temperature for Case 1 cable bundle parameters given in Table I. 
(Temperature rise in degrees centigrade plotted against time in minutes, 10 to 120 minutes.) 


It is clearly shown in Table I that the cable core temperature exceeds 
that of the cable sheath. A tightly packed firestop that exerts lateral 
pressure on the side of the cable array will provide sufficient heat 
sinking to reduce the temperature in the cable bundle. This physical 
effect is embodied in the contact conductance parameter, $H_f$. The 
larger the numerical value of this parameter, the tighter the firestop 
packing. Cases 1 and 2 of Table I show that increasing this parameter 
indeed results in a lowering of the cable sheath and cable core tem- 
peratures. 

The same general result prevails when furnace overpressure $\Delta p$ is 
reduced. This effectively reduces the heat transfer coefficient 
between the cable sheath and the hot furnace gases. This is demon- 
strated in Table I by comparing the end-point cable temperatures of 
Cases 2 and 5. Cable temperature can also be reduced by increasing 
the contact surface between cables, as measured by an increase in the 
parameter $A_c$, which corresponds to a tighter and more compact cable 
bundle; this is observed in Cases 2 and 3. The reduction is primarily 
due to the resulting smaller void space between the cables and 
secondarily to the larger conduction path presented to the 
interior cables. 
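
The comparisons drawn from Table I can be tabulated directly. The short sketch below simply restates the Table I entries and differences them against Case 2, which differs from each of the other cases in a single parameter. The dictionary layout and the function name are hypothetical conveniences; the furnace pressure is stored in the units of the table.

```python
# Table I entries: temperature rises (deg C) are stored as (cable 1, cable 3).
TABLE_I = {
    1: {"H_f": 0.034, "A_c": 0.10, "R_s": 1.27, "dp": 1.0,
        "sheath": (115, 164), "core": (199, 246)},
    2: {"H_f": 0.017, "A_c": 0.10, "R_s": 1.27, "dp": 1.0,
        "sheath": (224, 265), "core": (293, 332)},
    3: {"H_f": 0.017, "A_c": 0.15, "R_s": 1.27, "dp": 1.0,
        "sheath": (170, 190), "core": (248, 270)},
    4: {"H_f": 0.017, "A_c": 0.10, "R_s": 0.635, "dp": 1.0,
        "sheath": (93, 115), "core": (135, 154)},
    5: {"H_f": 0.017, "A_c": 0.10, "R_s": 1.27, "dp": 0.13,
        "sheath": (102, 115), "core": (188, 204)},
}


def change_from_baseline(case, quantity, cable_index, baseline=2):
    """Temperature-rise difference (deg C) relative to the baseline case.

    quantity    : "sheath" or "core"
    cable_index : 0 for cable 1 (corner), 1 for cable 3 (center)
    """
    return TABLE_I[case][quantity][cable_index] - TABLE_I[baseline][quantity][cable_index]


if __name__ == "__main__":
    labels = {1: "H_f doubled", 3: "A_c increased", 4: "R_s halved", 5: "dp reduced"}
    for case, label in labels.items():
        print(label, "-> sheath, cable 1:", change_from_baseline(case, "sheath", 0), "deg C")
```

Every single-parameter change away from Case 2 lowers both the sheath and the core temperature rise, in agreement with the trends discussed above.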


VI. CONCLUSIONS 


The following trends were generally indicated: 

(i) The cable core and cable sheath temperatures are largest for the 
interior cable (cable 3 of Fig. 5) and smallest for the corner cable 
(cable 1 of Fig. 5). 

(ii) The temperature of the cable core exceeds that of the cable sheath. 

(iii) The primary heat transfer mode to the cable sheath is from 
the flow of hot combustion gases through the void space between 
cables. 

(iv) Reducing the void space between the cables by tightly packing 
the cables and/or using smaller diameter cables impedes the flow of 
hot combustion gases and results in a significant reduction of the 
primary convective heat transmission mode. 

(v) A tightly packed firestop capable of providing some heat sink- 
ing to the cooler environs is the most practical and effective method of 
reducing the heat transfer properties of the cable bundle. 

(vi) The magnitude of the cable sheath temperature for similar size 
cables depends on the furnace gas pressure and, to a lesser degree, on 
the firestop depth. 
APPENDIX


A.—Fraction of cable sheath surface in contact 
with an adjacent cable. 
A;s— Fraction of cable surface in contact with com- 
bustion gases, 4(A. + Ay) = 1. 
F 2, Fo2— Radiation form factors between cables. 
H,—Heat transfer coefficient between cable and 
combustion gases. 
H.x8—Solid contact conductance between cables. 
H;—Contact conductance between cable bundle 
perimeter and firestop. 
H,— Linearized black body radiative heat transfer 
coefficient. 
h-s—Conductance between cable core and cable 
sheath. 
h;— Conductance between individual copper con- 
ductor wires. 
RS = 4hira—Effective radial thermal conductivity of cable 
core. 
$k_{cu}$—Thermal conductivity of copper wire.
$k_i$—Thermal conductivity of wire insulation.
$k_s$—Thermal conductivity of cable sheath.
$(k_z)_{eff}$—Effective axial thermal conductivity of cable
core.
$\kappa_{cu} = k_{cu}/(\rho c)_{cu}$—Thermal diffusivity of copper wire.
$\kappa_i = k_i/(\rho c)_i$—Thermal diffusivity of wire insulation.
$\kappa_s = k_s/(\rho c)_s$—Thermal diffusivity of cable sheath.
$(\kappa_z)_{eff} = (k_z)_{eff}/(\rho c)_{eff}$—Effective axial thermal diffusivity of cable
core.
L—Length of cable below slab. 
R.— Radius of cable core. 
R;— Overall radius of cable. 
r-— Radius of single copper conductor wire. 
ra— Total radius of single conductor wire. 
$\Delta p$—Furnace pressure.
$(\rho c)_{cu}$—Sensible volumetric heat capacity of copper
wire.
$(\rho c)_{eff}$—Effective sensible volumetric heat capacity of
cable core.
$(\rho c)_i$—Sensible volumetric heat capacity of wire
insulation.
$(\rho c)_s$—Sensible volumetric heat capacity of cable
sheath.
This paper reviews the major aspects of planning and conducting
field-tracking studies, including: (i) establishing well-defined, real- 
istic objectives; (ii) designing data collection and analysis procedures 
to meet the objectives; and (iii) ensuring the successful implementa- 
tion of these procedures. The paper gives general guidelines on 
matching study objectives and procedures, as well as detailed infor- 
mation on sample size selection for some common field-study situa- 
tions. Several studies recently conducted by Bell Laboratories Quality 
Assurance Center are used to illustrate the principles of field-study 
planning and implementation. 


I. INTRODUCTION


It is the function of Bell Laboratories Quality Assurance Center
(QAC) to provide assurance that telecommunication products
purchased by the Bell Operating Companies (BOCs) are of satisfactory
quality and perform as required. This assurance is provided through
the three primary activities of the Quality Assurance effort:

(i) Quality inspection and auditing at manufacturing, repair, and
installation locations.

(ii) Qualitative feedback gathered through informal contacts with
BOC personnel and a more formal engineering complaint procedure.

(iii) Quantitative field-tracking studies of selected products and
systems.

This paper discusses the third activity from both a historical and 
tutorial point of view. The authors relate some lessons and principles 
learned through field-tracking studies in the past and offer suggestions 
for those planning to conduct a field-tracking study (FTS) in the future.

Formal field-tracking studies were undertaken during the 1960s. The 
studies that will be described in this paper began in 1973 with Product 
Performance Surveys (PPS)' on Western Electric station sets. PPSs are 
designed to track field performance of the sets, identify problems
quickly, quantify the extent of those problems so that economic 
corrective action can be taken, and assure that the “fixes” are effective. 
Typically, PPSs on station sets are conducted concurrently in five or
six BOC locations chosen to provide geographic and climatic diversity 
and good representation of a variety of set types. This permits approx- 
imately one million station sets to be tracked at any given time, and 
provides approximately 100,000 trouble events for recording and anal- 
ysis each year. 

PPS data on station sets have been instrumental in detecting and
quantifying numerous field problems. Representative examples include 
a series of contact contamination problems in Touch-Tone* dials, 
ringer failures in certain premium station sets, and lamp failures in 
key telephone sets. 

The success of PPS has stimulated an increased effort in field
studies of other products, such as PBXs, switching networks, channel
bank equipment, and switching machines: just about the entire range of
telecommunications products purchased by the BOCs. Recently, this
field-study effort has been extended to include selected general trade
products manufactured by suppliers other than Western Electric. The
remaining sections of this paper discuss principles learned by the
authors in the process of conducting field-tracking studies and offer
suggestions for those planning to conduct an FTS.

Section II discusses important considerations in planning an FTS;
Section III discusses key steps in an FTS implementation program; 
Section IV is devoted to some illustrations from recent Quality Assur- 
ance Center studies. 


II. PLANNING A FIELD-TRACKING STUDY


The principal steps involved in planning a successful FTS are:
(i) Defining study objectives
(ii) Planning data collection to meet those objectives
(iii) Planning for successful data analysis.


2.1 Defining study objectives 


Perhaps the single most important requirement for a successful FTS 
is a clear statement of purpose that has been agreed to by the 
concerned parties. A study will frequently have an impact on many 
different organizations through its implementation, interpretation, and 
the use of its results. The designer, the manufacturer, and the user all 
have legitimate concerns in a given FTS. Obtaining their understanding
and agreement is an important, but not necessarily a simple, task. 


* Registered service mark of AT&T. 
Early in the planning of a study, small changes can easily be made
to accommodate the needs of potential users. But care must be taken 
not to try to answer all questions with a single study. Setting precise 
objectives that simplify implementation can avoid many pitfalls. For 
example, taking all the data that are easily accessible may initially 
seem reasonable, since we certainly don’t want to miss anything that 
might be important. But, trying to ensure that “too many” pieces and 
types of data are good invariably leads to a degraded level of data 
quality. The topic of data collection is discussed in detail in Section 
2.2.2. 

Frequently, objectives change as data are collected. This implies the 
need to provide for such changes initially and to monitor the flow of 
data to determine when such changes are appropriate. For example, a 
study that has the objective of comparing the performance of products 
from three suppliers may quickly show that one supplier is an obvious 
noncontender. Rules for dropping such a candidate could result in a 
more efficient use of resources. 

Objectives can be classified’ as: 

(i) Detecting problems
(ii) Quantifying known problems
(iii) Verifying quality audit information or reliability predictions
(iv) Establishing problem causes
(v) Measuring the impact of design or manufacturing change(s)
(vi) Evaluating the product.
A study can involve aspects of several of these, but procedures must 
be matched to purposes. For example, some studies are intended 
primarily to find and make a preliminary evaluation of problems. Once 
a problem has been identified, a more detailed study can be used to 
better quantify its economic impact. 

Early thinking about a proposed study may be clarified by the
following list of objectives, stated in a statistical framework:
(i) Point estimation (e.g., early failure rate)
(ii) Interval estimation (e.g., confidence or prediction intervals)
(iii) Comparisons (within study, with a standard, or with results
from a previous study)
(iv) Model testing (e.g., decreasing failure rate)
(v) Other information (previous list).
Failure to get agreement on specific objectives among all participants 
can easily lead to continuing disagreements regarding the implemen- 
tation of the study and the interpretation of its results. 


2.2 Planning data collection 


Once the general objectives of a field study have been established, 
the work aimed at meeting those objectives begins with the planning 
of appropriate data collection procedures. 
Most of this planning is aimed at answering the following questions:
(i) What data will be collected? (ii) How will the data be collected?
(iii) In what study population will the data be collected? and (iv) How
much data (sample size) will be collected? Finding the appropriate
answer to each of these questions for any given study is the key to its 
success. It is worthwhile examining each question separately and 
describing some of the answers that have been found appropriate in 
previous studies. 


2.2.1 What data will be collected? 


There are clearly many factors that will determine what data should 
be collected for any given field study. For purposes of this discussion, 
we assume that the study in question is directed at estimating the 
frequency of troubles occurring in a specified product population. This 
objective imposes the following minimum requirements on the data to 
be collected: 

(i) The data must include the size of the study population.

(ii) The data must record or count every trouble “event” occurring
in the study population during a specified time period, and must 
exclude or specifically identify events that are reported but occur 
outside the study population or specified time period. 

Clearly, a field study satisfying only these minimum requirements 
will yield merely gross trouble rate information. However, there are a 
number of situations appropriate for such a minimal study. 
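
The two minimum requirements can be sketched in a few lines of Python. This is only an illustration: the record layout (`unit_id`, `reported`), the unit identifiers, and the dates are hypothetical and are not taken from any study described in this paper.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TroubleEvent:
    unit_id: str    # hypothetical identifier of the unit that had the trouble
    reported: date  # date on which the trouble was reported


def gross_trouble_rate(events, population_ids, start, end):
    """Return (troubles per unit over the study period, number of excluded events).

    Events outside the study population or the study period are excluded
    from the rate but counted, so that they remain identifiable.
    """
    in_scope = [
        e for e in events
        if e.unit_id in population_ids and start <= e.reported <= end
    ]
    excluded = len(events) - len(in_scope)
    return len(in_scope) / len(population_ids), excluded


# Hypothetical study: four units tracked during calendar year 1982.
population = {"A1", "A2", "A3", "A4"}
events = [
    TroubleEvent("A1", date(1982, 3, 1)),
    TroubleEvent("A3", date(1982, 5, 9)),
    TroubleEvent("Z9", date(1982, 4, 2)),  # unit is not in the study population
    TroubleEvent("A2", date(1983, 1, 5)),  # reported after the study period
]
rate, excluded = gross_trouble_rate(
    events, population, date(1982, 1, 1), date(1982, 12, 31)
)
print(f"gross rate = {rate} troubles per unit, {excluded} events excluded")
```

Note that nothing in this record identifies a subpopulation or a trouble type, which is exactly the limitation of minimal data collection.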

First, for a larger, more detailed study, a preliminary estimate of the 
overall trouble rate is sometimes needed to determine the study 
population size. This topic will be further considered below, in the 
discussion on sample size (Section 2.2.4). Minimal data collection will 
usually suffice for such an estimate. Minimal data collection might 
also be appropriate after a detailed study to monitor the effectiveness 
of corrective actions that may have been taken in response to infor- 
mation obtained during the larger study. 

A minimal program of data collection may also be justified in cases 
where the need for a larger, more detailed and more costly study must 
be demonstrated. Several tracking studies that we have conducted 
were operated in this way, with minimal trouble rate data collected 
until a need for more detailed information was indicated by observing 
higher than expected trouble rates. 

For most field-tracking studies, however, minimal data collection 
falls short of what is needed in two important ways. First, since this 
approach provides no identification of the subpopulation in which any 
trouble occurs, it cannot yield specific trouble-rate estimates by sub- 
populations. Subpopulation, here, refers to a newly manufactured 
versus a repaired product, or to different manufacturing vintages of a 
given product that may reside within a single overall study population.
Second, because this approach provides no information on the nature 
of each trouble event, it cannot yield estimates of the frequency with 
which the product under study fails for specific reasons. 

Information on subpopulations and trouble types makes up virtually 
all of the detailed data that must be collected for any study; and 
determining the level of detail for each is a principal objective of study 
planning. 

As noted, subpopulation data would ordinarily include information 
on whether a piece of equipment in which a trouble occurred was 
newly manufactured or repaired, the date of manufacture or repair 
(vintage), service life, and additional descriptive information on the 
product, such as the issue or series number for a product that has 
undergone changes in design or manufacture. (Specifying series or 
issue numbers for circuit packs is an example of detailed product 
specifications used in tracking studies that are currently under way). 
Included, too, under the general heading of subpopulation information 
would be data on how or by whom the trouble was reported, e.g., 
customers or employees. 

In almost all FTS situations, the more detailed the data asked for, the
more complicated and costly the collection process will have to be. 
Therefore, it is important to limit to the extent possible the level of 
detail in subpopulation data requested. The guiding principle in choos- 
ing which characteristics should be included in data collection is 
straightforward: Include only characteristics for which it will be both 
useful and worthwhile to obtain separate subpopulation trouble-rate 
estimates when all the data have been collected. Since almost any 
level of detail can be viewed as potentially useful, the key is to choose 
only those characteristics that produce “partitions” that will be worth- 
while, i.e., that will yield subpopulations of sufficient size to permit 
making accurate trouble-rate estimates and comparisons. In other 
words, do not waste time and money partitioning the trouble data into 
subpopulations so small that the individual data are insufficient to 
yield accurate and, therefore, useful results. 
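
The cost of over-partitioning can be made concrete with a rough calculation. The sketch below assumes Poisson-distributed trouble counts and uses the normal approximation, under which the relative half-width of a 95 percent confidence interval on a rate is about 1.96 divided by the square root of the expected number of events. The trouble rate and subpopulation sizes are hypothetical, and the approximation is crude when only a handful of events is expected.

```python
import math


def relative_half_width(units, rate_per_unit_year, years=1.0, z=1.96):
    """Approximate relative half-width of a confidence interval on a Poisson rate.

    Uses the normal approximation: half-width / rate = z / sqrt(expected events).
    """
    expected_events = units * rate_per_unit_year * years
    return z / math.sqrt(expected_events)


# Hypothetical trouble rate of 0.1 per unit per year, one year of tracking.
for units in (10000, 1000, 100):
    width = relative_half_width(units, 0.1)
    print(f"{units:6d} units: estimate good to about +/-{width:.0%}")
```

Each tenfold reduction in subpopulation size widens the interval by a factor of about three, so a partition that leaves only a hundred units per cell may not support any useful comparison.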

In many studies it is important to determine precisely when in the 
life of the equipment each trouble occurs. In those cases deciding when 
the lifetime of a product starts (so-called “zero time”) is of crucial
importance. This is particularly true when early life failure rates are to 
be estimated. For example, does lifetime begin when units arrive, are 
inspected, are installed, or first operated? Dead-on-arrivals may show 
up as defective initially or later in time, depending on the type of 
failure, its effect on the system, the extent of failure detection, and the 
procedure for collecting the data. 

Electronic hardware frequently exhibits a decreasing failure rate 
during its early life. Here, failures tend to occur closer together during
the early weeks of operation. Therefore, depending upon the “zero 
time” definition, much of the study’s most useful data can be lost or 
misclassified. Particular care is required in defining zero time if units 
enter the study at different times, are turned on and off for testing, or 
are moved to different locations. 

To relate a real-life incident, one of the authors was recently asked 
to analyze some data from a study where the objective was failure-rate 
estimation after six months of operation. But the records gave only 
the date of installation and failure. Plotting failures against time gave 
very strange results, solely because these units were turned on only 
intermittently and no record of actual operating time on each unit was 
available. In this case the ability to analyze important time-related 
failure characteristics was lost because of insufficient detail in the data 
collected. 
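
The effect described in this incident can be illustrated with a small calculation. The figures below are invented for illustration and are not the data from the study mentioned; the point is only that a failure rate computed per calendar hour since installation can differ greatly from one computed per hour of actual operation.

```python
def failure_rate(failures, exposure_hours):
    """Failures per hour of exposure."""
    return failures / exposure_hours


# Hypothetical units: (calendar hours since installation,
#                      hours actually powered on, failures observed)
units = [
    (4000, 4000, 2),  # operated continuously
    (4000, 500, 1),   # operated only intermittently
    (4000, 250, 1),   # operated only intermittently
]

calendar_hours = sum(u[0] for u in units)
operating_hours = sum(u[1] for u in units)
failures = sum(u[2] for u in units)

rate_calendar = failure_rate(failures, calendar_hours)
rate_operating = failure_rate(failures, operating_hours)

print(f"per calendar hour:  {rate_calendar:.2e}")
print(f"per operating hour: {rate_operating:.2e}")
```

Without the operating-hours column, only the first figure can be computed, and it understates the failure rate of the intermittently used units.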

Detailed data on the “nature” of troubles occurring during any study 
generally fall in one of two categories. The first category includes a 
description of the trouble symptoms, the particular portion or com- 
ponent of equipment in which the trouble was observed, and results of 
any detailed failure mode analyses performed on the failed compo- 
nents. The second category of detailed trouble information includes 
data on the particular circumstances or environmental conditions 
associated with any trouble. Examples of this second category are
whether equipment was observed to be initially defective or to fail in
service, and the usage conditions. Below, we have listed some of the
detailed items that may be included on the nature of subpopulations:

(i) Product vintage (date of manufacture or repair)
(ii) Source (new, repair, etc.)
(iii) Length in service
(iv) Issue, series number, or other product code identifiers.

Like the subpopulation information, the level of detail required on 
the nature of troubles can have a profound effect on the data collection 
process, including who will be involved in that process. We have listed 
the trouble types as follows: 

(i) Component or equipment subcode
(ii) Trouble symptoms
(iii) Repair analysis results
(iv) Component failure mode analysis results
(v) Precise time of failure.

Obtaining data on failure-mode analyses, for example, may require 
the participation of technical organizations not directly involved in the 
field tracking itself. This, in turn, imposes additional requirements on 
the flow of hardware and paper (trouble tickets, analysis results, etc.) 
for a given study. At the end of this section we will illustrate some of 
these ideas with examples from recently conducted tracking studies.
Now, we turn to a closer examination of the question, “How will the 
data be collected?” 


2.2.2 How will the data be collected? 


There are as many answers to this question as there are products to 
be studied. Our aim in this paper, therefore, is to identify goals and 
procedures common to all or most field-tracking situations. 

Probably the best way to start this discussion is the same way it is 
best to start planning a data collection process—by identifying existing 
procedures for recording, collecting, and storing information on the 
field performance of the product under study. It is a rare product on 
which no information is recorded in the field or at a repair center. 
Planning data collection should ideally be viewed as a process of either 
supplementing or tailoring existing data sources to suit the needs of a 
particular FTS.

At this point it would be helpful to distinguish between data
collection carried out in the field (i.e., where the product under study is
used) and that carried out in repair locations, and to discuss each
separately.

In most tracking studies, the collection of field failure data involves 
the use of a trouble ticket that must be completed by people respon- 
sible for maintaining the equipment under study. As noted, completion 
of existing trouble tickets is frequently a part of the regular mainte- 
nance routine, and substitution of a more detailed study ticket, or 
“piggybacking” of the study ticket on an existing form, is preferable to
burdening maintenance people with a new and separate piece of paper. 
Whether or not a separate or modified existing form is used, there are 
a number of basic rules that govern the design of trouble tickets. First, 
the tickets should be kept as short and as simple as possible. Those 
are the obvious rules. Less obvious, but equally important, are the 
following: Wherever possible, the trouble tickets should be formatted 
in “modular” fashion, with separate sections devoted to different types 
of information—e.g., time and place of the trouble in one section, 
equipment description in another, trouble description in still another. 
The most frequently used modules should appear first and most 
prominently; less frequently used modules should appear later. The 
trouble ticket used in the station set Product Performance Survey
(PPS) (Fig. 1) illustrates these ideas. The top of the ticket gives
information on when and where a trouble occurred. That information
is required for each trouble report. Next comes information on the
nature of the trouble, also needed for each event. Data on the type of
set or component involved in the trouble come next; however, these
data are not needed if the equipment in question is returned with the
trouble ticket. Finally, the last section of the ticket describes field
adjustments, used only in those few cases where no hardware is
returned along with the ticket.

Fig. 1—Station set Product Performance Survey trouble ticket.

As this last discussion of the station-set PPS implies, there is more to
field data collection than the gathering of trouble tickets; there is 
frequently the gathering of failed hardware as well. The design of an 
effective, integrated hardware/trouble ticket data-flow system is as 
important as the design of the trouble ticket itself. The basic objectives 
of the data-flow system are: 

(i) To ensure that each piece of returned hardware reaches the
designated repair or diagnostic location and, in many cases, the
designated individual responsible for hardware analyses in the study; and

(ii) To ensure that the information on the trouble tickets reaches
the organization responsible for storing and analyzing the trouble data.

There are other important objectives, as well, primarily related to 
assuring compliance with study procedures and ensuring that hardware
analysis results may be uniquely identified with reported trouble 
events. We will discuss the issue of compliance later. The ability to 
associate hardware analysis results with trouble symptom reporting is 
important in tracing down the causes of No Trouble Found (NTF) 
returns (e.g., diagnostics problems). The use of serialized, multipart 
tickets is the prime vehicle for making such associations and will be 
illustrated below.

We have already noted that the burden imposed by an FTS on field
personnel can be minimized by using existing reporting forms, when- 
ever possible. For some products, the burden can be even further 
reduced by exploiting automatic data collection procedures. We in- 
clude in this category fully automatic data collection, such as that 
associated with accessing maintenance channel output of software- 
controlled equipment, and semiautomatic data collection, such as that 
associated with accessing computerized administrative data on cus- 
tomer trouble reports where the initial entry of the data into the data 
base depends on action by customers or field personnel. Access of 
existing data sources such as these has become an increasingly prom- 
inent mode of data collection in field-tracking studies. Access of repair 
location data bases serves an analogous function for hardware-repair 
analysis data. 


2.2.3 In what study population will the data be collected? 


In choosing the study population it is important to explicitly define 
the limits of the inferences to be made from the study. Are the results 
to be applied to all units, all units made in a given period or under 
given conditions, or used in a particular fashion, etc.? If the members 
of the study population received special care, were hand-made, pro- 
duced at one plant, etc., then conclusions beyond these boundaries 
depend upon engineering judgment more than upon statistical infer- 
ence. Confidence intervals reflect variability only in the population 
actually sampled and not from other sources. For example, increasing 
the sample taken in one operating area gives no information regarding 
inter-area differences. When sampling is performed by first selecting 
K operating areas and then sampling only within these, the formulas 
appropriate are those used in cluster sampling.® Here, the intra-area 
and inter-area variability are separated. Of course, looking at inter- 
area differences in detail can indicate important variables (mainte- 
nance procedures, environmental impact, etc.) that could be the focus 
of a follow-up study. Care must be taken before cause and effect 
relationships are assumed because of the multitudes of possible causes 
and interrelationships. As Cox relates:* 
“If we wish to apply the conclusions to new conditions or units,
some additional uncertainty is involved over and above the un- 
certainty measured by the standard error. The only exception 
... is when the units... are chosen from a well-defined population 
of units by a proper sampling procedure.” 


And later, 


“... it is important to recognize explicitly what are the restric- 
tions on the conclusions of any particular experiment.” 


In any tracking study there is a trade-off between more detailed 
conclusions regarding a smaller population and less detailed conclu- 
sions about a larger one. For example, a study may be aimed at 
determining whether a change in design has improved reliability in 
systems subject to certain load characteristics, or whether an overall 
reliability increase independent of load has occurred. A careful state- 
ment of objectives will greatly assist resolving such questions. 

Once a population of interest has been defined and agreed upon, 
technical sampling questions can be addressed. There are certain 
population characteristics that require special attention. For example, 
if a small proportion of the units contribute a large proportion of the 
events under study, stratification and other specialized techniques may 
be required. Also, considerable gains in efficiency can sometimes be 
realized by the use of ratio or regression estimates. Here, known 
characteristics of products or systems under study are related to the 
characteristics of interest in the study. 


2.2.4 How much data will be collected: sample size considerations 


Selecting the appropriate number of units to be included in an FTS 
is very important. On the one hand, a sample size that is too large may 
add unnecessary expense to the study. On the other hand, a sample 
size that is too small may mean that any statistical test using study 
data may lack sufficient power to draw meaningful conclusions. Several 
authors”®’ have addressed this problem. Reference 5 took the theory 
of Refs. 7 and 8 and transformed it into usable curves; these curves 
will be discussed in general in this section and in detail in the appendix. 

The parameters of interest in a field study are summarized in Table
I. In cases A and D a sample size will be chosen to control the
precision of the estimates within certain bounds. In the remaining cases
the sample size will be chosen to control the probability of making
incorrect conclusions. If we assume that failures associated with a
proportion occur according to a binomial model and that failures
associated with a rate occur according to a Poisson model, it is possible
to develop excellent sample sizing guidelines for each of the cases A
through F. (A discussion of model selection and use is included in the
next section.) Each case is discussed in detail, with examples, in the
appendix.


Table I—Parameters of interest in a field study

                                          Proportion    Rate
Estimating one parameter                  Case A        Case D
Testing hypothesis about one parameter    Case B        Case E
Comparing two parameters                  Case C        Case F
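
For the two estimation cases (A and D), the order of magnitude of the required sample can be sketched with textbook normal approximations to the binomial and Poisson models. This sketch is not a reproduction of the curves of Ref. 5 or of the appendix procedures; the function names, the planning guesses, and the 95 percent level (z = 1.96) are assumptions made for illustration.

```python
import math


def units_for_proportion(p_guess, half_width, z=1.96):
    """Case A: units needed to estimate a proportion to within +/- half_width.

    Normal approximation to the binomial: n = z^2 * p * (1 - p) / d^2.
    """
    return math.ceil(z ** 2 * p_guess * (1 - p_guess) / half_width ** 2)


def exposure_for_rate(rate_guess, rel_half_width, z=1.96):
    """Case D: exposure (e.g., unit-years) needed to estimate a rate to
    within +/- rel_half_width of its value.

    Normal approximation to the Poisson: expected events = (z / r)^2.
    """
    events_needed = (z / rel_half_width) ** 2
    return events_needed / rate_guess


# Hypothetical planning guesses.
n_units = units_for_proportion(p_guess=0.05, half_width=0.01)
unit_years = exposure_for_rate(rate_guess=0.1, rel_half_width=0.2)
print(f"Case A: about {n_units} units")
print(f"Case D: about {unit_years:.0f} unit-years of exposure")
```

Both formulas need a preliminary guess at the quantity being estimated, which is one reason a minimal preliminary study is sometimes run first.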


2.3 Planning for successful data analysis 


In this section, we consider both the data analysis, itself, and the 
data storage and retrieval procedures that make the analysis possible. 


2.3.1 Model building and data analysis 


It requires no lengthy argument to establish that the payoff from 
any field study comes only with the successful analysis of the data 
from that study. And in a very real sense, all of the detailed planning 
on data collection is aimed at ensuring that at the conclusion of the 
study it will be possible to carry out all of the data analyses appropriate 
to the study objectives. 

In broad terms, there are three things that generally get done with 
field-tracking data. These are: 

(i) Estimating trouble or replacement rates, including the
construction of confidence intervals, where appropriate and practical;

(ii) Searching the data for anomalies (equipment types or vintages
that stand out, or trouble causes that stand out); and

(iii) Making comparisons of product performance among different
types, or vintages, of equipment.

Each of these procedures requires careful planning and a close 
linkage between the setting of objectives, the design of the data 
collection process, and the data analysis itself. 

During both planning and implementation of a study, the mechanism 
by which the study objectives, the actual data collection, and the data 
analysis are linked is the statistical “data model.” It is through the 
data model that the nondeterministic (stochastic) nature of the data 
is described, and through the model that statistical inferences on the 
questions of interest to the study are made. 

As noted above, most field-tracking studies concern themselves with 
counts of events (failures, replacements, etc.). It is for this reason that 
the simplest and most frequently used models in field studies are the 
binomial and Poisson models. 

The binomial model relates the number of events of interest (failures, 
say), X, to the total number of “trials” (opportunities for failure), N, 
through the expression: 

$$\text{Probability}[X = k] = \frac{N!}{k!\,(N-k)!}\, p^{k} (1-p)^{N-k}, \qquad k = 0, 1, \ldots, N,$$

where $p$ is the probability of a failure on an individual trial.

The Poisson model relates the number of events of interest, $X$, to
the total amount of time during which those events can have occurred,
$t$, through the expression:

$$\text{Probability}[X = k] = \frac{(\lambda t)^{k} e^{-\lambda t}}{k!}, \qquad k = 0, 1, \ldots,$$

where $\lambda$ is the rate at which the events occur in time.

Both models assume a uniform probability or intensity of occur- 
rences—from trial-to-trial for the binomial, over time for the Poisson. 
For studies in which a model allowing for changing failure intensity 
seems appropriate (e.g., studies of equipment that may be subject to 
infant mortality), other models such as the Weibull and lognormal are 
commonly employed. Detailed information on the form and use of 
these models may be found in any one of several statistical/reliability 
texts’ and we will not attempt to describe them here. 
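
The binomial and Poisson expressions above can be evaluated directly. The short Python sketch below does so with the standard library only; the function names and the numerical inputs are illustrative assumptions.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k failures in n trials, failure probability p per trial."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam, t):
    """Probability of exactly k events in time t when events occur at rate lam."""
    mu = lam * t
    return mu ** k * math.exp(-mu) / math.factorial(k)


# Hypothetical inputs: 10 trials with p = 0.1; rate 0.5 per year over 2 years.
print(binomial_pmf(0, 10, 0.1))  # probability of no failures in 10 trials
print(poisson_pmf(0, 0.5, 2.0))  # probability of no events in 2 years
```

Both functions assume the uniform failure probability or intensity that the models themselves assume; they say nothing about whether that assumption holds for a given set of field data.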

None of the models mentioned thus far is equipped to handle data 
collected under changing study conditions (e.g., changing environment, 
age, study locations, etc.), or so-called “nuisance factors.” 

To illustrate the problem of nuisance factors, suppose we wanted to
compare the replacement rates of two types of equipment (called “old” and
“new”), from a study in which the “old” equipment was observed in
one study location, while the “new” equipment was observed in that
and other study locations. Here, the factor of interest is equipment
type (old versus new); the nuisance factor is the difference that may 
exist between study locations, which could bias the comparison be- 
tween the old and new equipment. It is at this point that the use of 
relatively sophisticated data-analytic techniques, employing tools such 
as the well-known linear (or log-linear) model, becomes necessary and 
worthwhile. These techniques allow for separating the effects (on 
replacement rates, for example) caused by equipment differences, 
study location differences, etc. and for getting at the factors of interest 
without ignoring potential biases introduced by the presence of nuis- 
ance factors. The use of linear models is well documented in both the 
statistical and engineering literatures.°’? However, when confronted 
with an apparent need to make use of such techniques, the study 
designer and data analyst should seek the assistance of a statistician 
who is thoroughly familiar with the application of these techniques. 

The use of any of the models mentioned above involves making 
some assumptions about the data. For example, as noted, use of the 
binomial or Poisson models assumes a uniform failure probability or 
intensity. Use of a linear model generally involves some assumptions 
of independence between the way in which different factors affect the 
probability of equipment failure. If those assumptions are violated, the 
resulting data analysis can be invalid and, worse, misleading. For 
example, if the failure intensity changes with time (age) for a given 
type of equipment, use of the Poisson model in analyzing the data on 
that equipment could easily mask important information on both the 
short- and long-term reliability of the equipment. Invalid assumptions 
concerning the independence of various factors employed in a linear 
model can mask or falsely create the impression of cause-and-effect 
relationships between various factors and the probability of failure. 
Rather than attempt to catalog all of the field-study conditions and 
assumptions associated with the use of any particular model, we will 
give some general guidelines on the choice and use of models in field- 
tracking studies. 

Probably the simplest but most important rule to use in choosing an 
FTS model is “keep it simple.” The more complicated a model is, the 
more parameters it will use that must be estimated during the data 
analysis, and the more assumptions it will require to make that analysis 
valid. As this last discussion implies, there are two additional rules 
that are closely related to the simplicity rule: 

(i) Estimability—Since data analysis, at its core, involves making 
statistical inferences about parameters in the model from the available 
data, it is essential that the model and the collection process be 
matched to ensure that the right data are available in sufficient 
quantities to make inferences about all the parameters of interest. 
This is a point we have already touched on in the discussion on data 
collection. 

(ii) Verifiability—The assumptions implicit in the use of any model 
must be verifiable or the results of the FTS will remain open to doubt. 
In some cases, engineering judgment can be used to justify certain 
model assumptions. In all cases, every effort must be made to verify 
assumptions from the data—either during a procedural trial (see 
Section 3.3 below), or as the first step in the data analysis stage of the 
study. A wide variety of statistical techniques are available for testing 
the uniformity and independence assumptions typically encountered 
in FTS model use; these techniques should be applied with the advice 
and assistance of a trained statistician. 

In summary, successful data analysis is dependent on the choice of 
an appropriate FTS model that is matched to both the actual study 
conditions and to the data collection procedures employed in the 
study. 


2.3.2 Data storage and retrieval 


With the exception of very small-scale studies, involving perhaps 
fewer than 100 trouble events in all, computerized data storage is a 
great asset—if not a necessity—in permitting complete and timely 
analyses of field-tracking data. There are a number of data systems 
available [for example, Data Management System (DMS)*, RAMIS†, 
etc.] that lend themselves to constructing field-tracking data bases. 
Among the factors that must be considered are total eventual size, 
frequency of access required, and most important, flexibility of ac- 
cess—i.e., flexibility in retrieving and summarizing the data by one or 
more characteristics, such as equipment type or vintage, or type of 
trouble. It would be very difficult, for example, to compare the perfor- 
mances of different vintages of a given product if the data retrieval 
system did not permit easy, separate access to the trouble data for 
each vintage. On the other hand, it is important not to confuse a need 
for flexible data access with a need for an elaborate data retrieval 
system that turns out regular, detailed data summaries that display 
results in every conceivable way. The key is to retain flexibility without 
trying to preprogram every possible way of looking at the data. 


III. FTS IMPLEMENTATION 


In this section we briefly consider several topics in the actual 
implementation of an FTS: 
(i) Developing procedures and training personnel 
(ii) Assuring compliance with study procedures 
(iii) Conducting a procedural trial. 


3.1 Developing procedures and training personnel 


Based on mutually agreed-upon objectives, specific procedures and 
forms for data collection need to be developed. Determining the extent 
of automatic data retrieval, checking the validity of the inputs, deciding 
exactly what data are necessary, etc., are detailed questions that 
require resolution. 

Unless rules are provided to meet contingencies, people tend to 
either make up their own rules or just get discouraged about partici- 
pation in the study. Although all possibilities cannot be provided for, 
care should be taken to anticipate the most common “unusual” events. 
As a default, a space for “additional comments” or “other” on data 
forms will alert the data analyst to the fact that the specified categories 
were ambiguous, not mutually exclusive, not exhaustive, etc. 

The training of the field personnel who will actually perform the 
data collection is a very important step. Hands-on teaching with real 
situations will prepare them for being on their own. Giving them an 
indication of the reasons for the study and how important their 
participation is can improve both their morale and the quality of 
the data collected. A specific procedure to provide continuing contact 
and periodic feedback of results can also be a strong positive stimulus. 


* Data Management System, developed by Bell Laboratories. 
† RAMIS is a trademark of Mathematica, Inc. 


3.2 Compliance 


It is difficult to overemphasize the importance of monitoring 
compliance with tracking-study procedures. The basic output of any FTS is 
a measure of the reliability of the equipment under study. In order for 
that measure to be useful and unbiased (by differences in the com- 
pleteness of reporting for different products, trouble types, etc.), all or 
substantially all of the trouble events experienced by the equipment 
must be reported. It is the function of compliance procedures to ensure 
that this is the case. 

Basically, compliance can be checked in one of two ways. If an 
independent (of the FTS) count of trouble events for the equipment 
under study is available, compliance can be checked by comparing 
that count to the number of troubles reported through the study 
procedure. This method is used in the station set PPS, where adminis- 
trative counts of customer trouble reports serve as the independent 
count of station troubles in any PPS study location. If no such count is 
available, but the equipment under study is located in a geographically 
small, reasonably well controlled setting, such as a central office, 
serializing of the equipment under study and periodic mapping of the 
office inventory—when compared to the reported troubles—can serve 
as an effective compliance check. With either procedure, the key to 
maintaining good compliance is fast feedback to the people responsible 
for providing the field data and their management about the degree to 
which study procedures are being followed. It is for this reason, 
principally, that some identity of the field person reporting the trouble 
is included on most field-tracking study tickets. 

As noted earlier, in addition to field data collection, many tracking 
studies involve the collection of data—usually from failure-mode anal- 
yses—at repair locations and/or diagnostic laboratories. Reporting 
forms for such analyses will usually have to be tailored to the particular 
equipment under study. But some of the general principles that govern 
field data collection apply to the hardware failure analysis data as well. 
The flow of hardware and paper must be designed to ensure that (i) 
each piece of hardware returned can be accounted for and checked off 
against reported field troubles, and (ii) individual hardware analyses 
can be associated with reported field trouble symptoms. 


3.3 Procedural Trial 


Once study procedures and forms have, at least tentatively, been 
developed, a trial is an excellent way to shake out unexpected 
problems. Here, the people who will participate in the real study 
attempt to collect some actual data. Estimates of the speed and accuracy 
of filling out forms, difficulties with interpreting procedures when faced 
with real situations, completeness of instructions, and potential 
usefulness of results are some of the possible outputs. If extensive 
revision of procedures, forms, etc., is required, a second trial may be 
necessary. 

In addition to testing the data collection portion of the study, a trial 
of the data analysis methodology should also be made with simulated 
or actual data. It is useful to present possible conclusions, with their 
justification, to the users of the study results. Then, a comparison of 
their subjective impressions from the raw data with the quantitative 
results from the statistical analysis can be used to improve both. It is 
also at this point that model assumptions are to be verified or modified 
as needed. 


IV. ILLUSTRATIONS 


In this section, we briefly describe some recent field-tracking studies. 
Perhaps the longest running study is the Product Performance Survey 
on station sets, which we mentioned earlier in this paper. Figure 2 
shows the flow of hardware and data in that study. The trouble ticket 
is shown in Fig. 1. Note the modularized design of the ticket described 
above. Analysis of returned equipment in this study is carried out by 
analysts in the Western Electric Quality Assurance organization who 
are dedicated to the study. These analysts encode the results of their 
analyses, as well as other information on the trouble tickets that 
accompany the returned hardware, for direct entry into a data base. 
Compliance is monitored by comparing the number of PPS trouble 
ticket returns to the total number of trouble reports tracked by 
administrative reporting systems in each study location. 

A second example is illustrated in Figs. 3 and 4, which are the 
data-reporting form and a flow sheet, respectively, for the FTS of Northern 
Telecom’s DMS-10 switching office. The flow sheet illustrates a point 
discussed in Section II, namely, that numerous organizations are often 
involved in an FTS. Cooperative planning among the organizations involved 
played an important role in making this study run smoothly and 
produce meaningful results. The report form shows a completely 
different set of data fields and possible responses than did the PPS 
trouble ticket. Just as trouble tickets are compared with local 
administrative data in the station set study, report forms for this FTS are 
compared with maintenance and outage data automatically collected 
from the switching machine’s maintenance output channel. 


Fig. 2—Product Performance Survey data flow diagram. 

Fig. 3—DMS-10 Tracking Study Report. 

Fig. 4—DMS-10 switching system installation tracking study. (a) Routing of 
information. (b) Routing of study units. 


V. CONCLUSIONS 


In this paper we have discussed several important aspects of planning 
and conducting an FTS. We have shown how careful planning 
beforehand in the areas of data collection, population definition, 
sample size, and stating of objectives is essential. We have also discussed 
means of ensuring that the study is producing the required ongoing 
data. If properly planned and conducted, FTSs can and do play a key 
role in assuring the quality and reliability of telecommunication 
products. 


Fig. 5—Minimum sample sizes needed to generate 90-percent confidence intervals. 


APPENDIX 

Sample Size Selection 


In this appendix we discuss in detail the six cases of sample size 
selection described in Section 2.2.4 of this article. These cases are: 
(i) Estimating a parameter 
(ii) Testing a hypothesis about one parameter 
(iii) Comparing two parameters. 
Each of these applies to both proportions and rates, giving six cases in 
all. Each case is discussed in turn below. The six cases are shown in 
Table I, Section 2.2.4. 


A.1 Case A 


In Case A we wish to have a sample size to control the precision of 
the estimate of a percentage within certain bounds. The estimation 
process is subject to imprecision; therefore, it is customary to express 
the estimate as an interval, say 2 to 6 percent, as opposed to a single 
point, say 4 percent. This interval is chosen so that if we were to repeat 
the process of data collection and interval construction, our intervals 
would cover the true, unknown percentage a very large proportion of 
the time. The shorter the interval, the more precise is our estimate. 
This interval will decrease in width as the sample size increases. We 
therefore select the sample size before the FTS to obtain an anticipated 
width for our interval after the FTS. Figure 5 shows the sample sizes 
necessary to generate 90-percent confidence intervals that are 2Δ 
wide. The sample size depends on the true percentage. The maximum 
sample size is required when the true percentage is 50 percent. 


Fig. 6—Minimum sample sizes for P₀ = 1 percent. 

Fig. 7—Minimum sample sizes for P₀ = 2 percent. 

Fig. 8—Minimum sample sizes for P₀ = 5 percent. 


Example of Case A: Suppose we are only interested in estimating 
the percentage of units that are initially defective. We think that this 
percentage is less than 15 percent, and we want the estimated interval 
to be at most 6 percent wide. Therefore, Δ = 3 and we see in Fig. 5 
that a sample of size 400 is required. If we had no idea as to the true 
percentage we would use the maximum sample size for 50 percent, that 
is, 750. Note that the curves are symmetrical about 50 percent. 
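
As a rough cross-check on the values read from Fig. 5, the sketch below uses the familiar normal approximation for a binomial proportion, n ≈ z²p(1 − p)/Δ². The approximation, the function name, and the choice of z for a two-sided 90-percent interval are assumptions made for illustration; the text does not state how the curves in Fig. 5 were computed.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p, half_width, confidence=0.90):
    """Approximate n so that a two-sided interval for p has the given half-width.

    Assumption: normal approximation to the binomial (not taken from the paper).
    p and half_width are fractions, e.g. 0.15 and 0.03.
    """
    # Two-sided interval: put (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z * z * p * (1 - p) / half_width ** 2)


# Case A example: true percentage at most 15 percent, interval 6 percent wide.
n_15 = sample_size_for_proportion(0.15, 0.03)
# No prior knowledge: use the worst case of 50 percent.
n_50 = sample_size_for_proportion(0.50, 0.03)
print(n_15, n_50)
```

Under these assumptions the two results come out near 380 and 750, which agrees reasonably well with the 400 and 750 read from the curves.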


A.2 Case B 


In Case B we wish to test the hypothesis that a proportion is less 
than or equal to P₀. We will look at a sample of n units, and make one 
of two decisions: 

(i) If we see that a number of units less than or equal to c, the 
“acceptance number,” have the trait associated with the proportion, 
then we will accept the hypothesis that the proportion is less than or 
equal to P₀. 

(ii) If we see that more than c of the units have the trait, then we 
will reject the hypothesis in favor of the alternative that the proportion 
is greater than P₀. 

We wish to structure the test so that if the true value of the 
proportion is P₀, we will make decision (i) a large portion of the time, 
and if the true value of the proportion is P₁ (a specified value larger 
than P₀), we will make decision (ii) a large portion of the time. The 
reader more interested in acceptance sampling plans, which are an 
example of such a situation, should refer to a specialized reference, 
e.g., Ref. 12. 


Fig. 9—Minimum sample sizes for P₀ = 10 percent. 


Figures 6 through 9 show the required sample size for values of P₀ 
= 1, 2, 5, and 10 percent for 80-, 90-, and 95-percent confidence levels. 
As an example of the use of the curves, let P₀ = 1 percent and P₁ = 5 
percent. We see in Fig. 6 that for a 90-percent confidence level, a sample 
size of 100 is needed. 
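
The two-decision test described above can be sketched as a direct search over exact binomial probabilities. The function names and the reading of "confidence level" as the probability of making the correct decision at both P₀ and P₁ are assumptions for illustration; the paper's curves may have been constructed differently.

```python
from math import comb


def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(c + 1))


def smallest_plan(p0, p1, confidence, n_max=2000):
    """Smallest n (with its acceptance number c) meeting both requirements.

    Requirement 1: accept with probability >= confidence when the proportion is p0.
    Requirement 2: reject with probability >= confidence when the proportion is p1.
    Returns (n, c), or None if no plan is found up to n_max.
    """
    for n in range(1, n_max + 1):
        for c in range(n + 1):
            if binom_cdf(c, n, p0) >= confidence:
                # This is the smallest c that protects p0. A larger c would
                # only make acceptance at p1 more likely, so test this one.
                if binom_cdf(c, n, p1) <= 1 - confidence:
                    return n, c
                break
    return None


# Example from the text: P0 = 1 percent, P1 = 5 percent, 90-percent level.
plan = smallest_plan(0.01, 0.05, 0.90)
print(plan)
```

For the example in the text this search should return a plan with n slightly above 100 and c = 2, in line with the sample size of about 100 read from Fig. 6.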


A.3 Case C 


This case deals with comparing two percentages, call them percentage 
A and percentage B. These percentages might be similar characteristics 
of competing products or competing designs. For example, 
we might be interested in the percentages of circuit packs from two 
suppliers that are dead on arrival. After the FTS we may arrive at one of 
three conclusions: 

(i) The two percentages are not significantly different 

(ii) Percentage A is larger than percentage B 

(iii) Percentage B is larger than percentage A. 

There are certain risks in arriving at incorrect conclusions. The risks 
decrease with increasing sample size. We wish to control, at a low level, 
the risk of not making conclusion (i) when percentages A and B are 
equal. And we wish to control, at a low level, the risk of not making 
conclusion (ii) when percentage A is Δ larger than percentage B [or, 
similarly, the risk of not making conclusion (iii) when percentage B is 
Δ larger than percentage A]. Figure 10 gives sample sizes necessary to 
accomplish this at the 90-percent confidence level. 


Fig. 10—Minimum sample sizes for comparing two proportions at the 90-percent 
confidence level. 

Example of Case C: Suppose we wish to compare the percentages 
of plug-in units (from two suppliers) that fail during the warranty 
period. Further, we assume that the lower percentage will be less than 
20 percent. We wish to have a high probability of concluding that the 
upper percentage is greater than the lower percentage when the upper 
percentage is 5 percentage points greater than the lower. For Δ = 5 and 
a lower percentage of 20, we need to look at 1300 units from each 
supplier. With no knowledge of the true percentages we would use the 
sample size for 50 percent, that is, 1700. 


Fig. 11—Minimum observed failures for estimating a failure rate. 


A.4 Case D 


Fig. 12—Minimum sample sizes for failure-rate estimation (6-month interval). 


Cases D, E, and F deal with failure rates, as opposed to the percentages 
of Cases A, B, and C. (The results for Cases D, E, and F must be 
used subject to the cautions given at the end of this appendix.) Cases 
D, E, and F require the use of two curves. The first curve will tell us 
how many failures we need to see. The second curve will tell us how 
many units must be included in the FTS so that we are reasonably 
certain that the failures occur in a prescribed time period. In Case A 
we measure the precision of our estimation by the width of the interval, 
expressed in absolute percentages. In Case D, we will measure the 
precision in terms of relative percentages. For example, if our interval 
is 1500 FITs* ± 5 percent = 1500 ± 75 FITs = (1425, 1575), then we will 
say that the precision is 5 percent. This interval corresponds to (1.25, 
1.38) failures per 100 sets per year. 
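
The conversion from FITs to failures per 100 units per year used in the interval above can be checked with a few lines. The function name and the 8760-hour (365-day) year are assumptions for illustration; the paper's footnote gives only the rounded conversion factor.

```python
HOURS_PER_YEAR = 8760  # assumption: 365 days of 24 hours


def fits_to_failures_per_100_units_per_year(fits):
    """Convert a failure rate in FITs (failures per 10**9 unit-hours)."""
    failures_per_unit_hour = fits * 1e-9
    return failures_per_unit_hour * HOURS_PER_YEAR * 100


low = fits_to_failures_per_100_units_per_year(1425)
high = fits_to_failures_per_100_units_per_year(1575)
print(round(low, 2), round(high, 2))
```

Rounded to two decimals, the endpoints agree with the (1.25, 1.38) interval quoted in the text.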

Example of Case D: Suppose that we wish to obtain a precision of 
15 percent at the 90-percent confidence level in the estimate of the 
failure rate of a plug-in unit. In Fig. 11 at an abscissa of 0.15 (15 
percent) we see that 120 failures must be observed. Suppose that the 
FTS is to last 12 months and that our reliability prediction gives us an 
estimated FIT rate of 2500. In Fig. 14 we see that about 7000 units need 
to be included in the study. 


* FIT = failures in 10⁹ hours = 8.75 × 10⁻⁴ failures per 100 units per year. 


Fig. 13—Minimum sample sizes for failure-rate estimation (9-month interval). 

Fig. 14—Minimum sample sizes for failure-rate estimation (12-month interval). 


Figures 12 through 15 give required sample sizes for studies of 
lengths 6, 9, 12, and 18 months, and for FIT rates up to 10,000. If some 
other combination is needed, then the following formula should be 
used: 

$$N = \frac{F + 1.645\sqrt{F}}{(7.2 \times 10^{-7})\,\lambda T} + \frac{F}{2}, \qquad (1)$$

where 

F is the number of failures, 

λ is the prior estimate of the failure rate in FITs (failures in 10⁹ 
hours), 

T is the number of months the study will last, and 

N is the required sample size. 

This formula provides 95-percent confidence that the required number 
of failures will be observed. 
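
Equation (1) is simple to evaluate directly. The sketch below is one way to do so; the function and argument names are illustrative and not from the paper, while the constants 1.645 and 7.2 × 10⁻⁷ are those of eq. (1).

```python
from math import sqrt


def required_sample_size(failures, fit_rate, months):
    """Evaluate eq. (1): units needed to observe `failures` failures.

    failures -- F, the number of failures that must be observed
    fit_rate -- prior estimate of the failure rate, in FITs
    months   -- T, the length of the study in months
    """
    # Expected failures per unit over the study, using the constant of eq. (1).
    expected_per_unit = 7.2e-7 * fit_rate * months
    return (failures + 1.645 * sqrt(failures)) / expected_per_unit + failures / 2


# Two-period example of Section A.7: 41 failures in each 3-month period.
early_life = required_sample_size(41, 10_000, 3)
steady_state = required_sample_size(41, 4_000, 3)

# Case F example: 36 failures, 6000 FITs, 12 months.
case_f = required_sample_size(36, 6_000, 12)

print(round(early_life), round(steady_state), round(case_f))
```

These evaluate to roughly 2406, 5985, and 903, consistent with the figures of 2400, 5980, and 900 quoted in Sections A.7 and A.6. For the Case D example (120 failures, 2500 FITs, 12 months) the formula gives roughly 6450, somewhat below the "about 7000" read from Fig. 14.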


A.5 Case E 


In Case E we wish to test the hypothesis that a rate is less than or 
equal to a specified value, V1. Based upon the data observed, we will 
either 

(i) Accept the hypothesis that the rate is less than or equal to V1, 
or 

(ii) Reject the above hypothesis in favor of the alternative that the 
failure rate is greater than V1. 

We wish to structure the test so that if the true value of the rate is 
V1, we make decision (i) with a high probability, and if the true value of 
the rate is R × V1, we make decision (ii) with a high probability. 

Example of Case E: Suppose we wish to check to see how a newly 
designed part has changed the reliability of a piece of equipment. We 
are satisfied with R = 2 and the 90-percent confidence level. Figure 16 
shows that 15 failures must be observed. 


Fig. 15—Minimum sample sizes for failure-rate estimation (18-month interval). 


A.6 Case F 


Here we wish to compare failure rates of two competing products. 

At the end of the FTS we can arrive at one of three conclusions: 

(i) Failure rate A and failure rate B are not significantly different 

(ii) Failure rate A is larger than failure rate B 

(iii) Failure rate B is larger than failure rate A. 

Again there are risks of arriving at incorrect decisions. As we increase 
the sample sizes, we can decrease these risks. We wish to control, at a 
low level, the risk of not making conclusion (i) when failure rates A 
and B are equal. And we wish to control, at a low level, the risk of not 
making conclusion (ii) when failure rate A is R times as large as failure 
rate B [or, similarly, the risk of not making conclusion (iii) when failure 
rate B is R times as large as failure rate A]. 


Fig. 16—Minimum observed failures for a hypothesis test (one failure rate). 


Example of Case F: Suppose we wish to use the FTS to compare 
the failure rates of the channel units of two different suppliers. Suppose 
further that we wish to have a high chance (90-percent probability) of 
concluding that the larger failure rate is larger than the smaller failure 
rate when indeed the larger failure rate is twice the smaller. In Fig. 17 
we see that we need to observe about 36 failures. If the study is to last 
12 months and our reliability prediction yields an estimate of 6000 
FITs, then Fig. 14 shows that a sample size of 900 is required to be 95 
percent certain of observing the required number of failures. That is, 
we need 900 of each supplier’s units in the study. 


A.7 Cautions 


In Cases D, E, and F, if the required number of failures is not 
observed in the nominal time period for the FTS, then the desired 
precision will not be achieved. (This might occur if the reliability 
prediction is in error and yields a higher than actual FIT rate. If the 
prediction is much higher than the actual rate, we will be 
incorrectly led to believe that the required number of failures will be 
observed in a shorter interval than is actually needed.) In this case it 
would be wise to extend the study period until the required number of 
failures is observed. 


Fig. 17—Minimum observed failures comparing two failure rates. 

The theory developed for Cases D, E, and F requires that the failure 
rate be constant throughout the FTS. Even for very large sample sizes, 
the theory is sensitive to departures from this assumption. Therefore, 
if we know that the failure rate is high for one time period (e.g., early 
life) and low for a different time period (e.g., steady state), then we 
must do a separate analysis on each period, as shown in the following 
example. 

Assume that the early failure period is 3 months. Our reliability 
predictions indicate that the early failure rate will be about 10,000 FITs 
and that the steady-state failure rate will be about 4,000 FITs. We wish 
to obtain a precision of 0.25 at the 90-percent confidence level in 
estimating each of the failure rates in an FTS that we wish to finish in 
6 months or less. What sample size is needed? Figure 11 shows that we 
need to observe 41 failures, that is, we must observe 41 failures in the 
early-life period (months 1 to 3) and 41 failures in the steady-state 
period (months 4 to 6). Use of eq. (1) shows that we need at least 2400 
units for the early-life period and 5980 for the steady-state period. 
Since we need to satisfy both requirements we will need a sample size 
of 5980. 

The example above illustrates another important point. If you want 
to use the FTS to estimate several characteristics, then go through the 
sample size analysis for each characteristic. The FTS will satisfy all 
requirements if it has the maximum of the required sample sizes. 

In Cases B, D, E, and F, curves for several confidence levels are 
placed on one page. However, for Cases A and C, each confidence level 
would take a separate page, so only the 90-percent confidence level 
was given. For other confidence levels, see Ref. 8. 


LETTER TO THE EDITOR 


Comments on “Voice Storage in the Network—Perspective and History,” 
by E. Nussbaum* 


In a recent article E. Nussbaum discussed the FCC’s rejection of 
AT&T’s petition for waiver to allow the offering of Custom Calling 
Services II in the U.S. under the Computer Inquiry II decision. 
Unfortunately, references were not given to these decisions for the 
benefit of those readers who may wish to learn more about this 
apparent frustration of technology and the policy issues involved. The 
FCC rejection can be found in 88 FCC 2d 1. The Computer Inquiry II 
decision is given in 47 CFR 64.702, adopted in 77 FCC 2d 384 (Final 
Decision) on reconsideration, 84 FCC 2d 50, appeal pending sub nom 
CCIA vs. FCC, Case No. 80-471 (D.C. Cir. 1980). 


Michael J. Marcus 

Acting Chief 

Technical Analysis Division 

Office of Science & Technology 
Federal Communications Commission 


* B.S.T.J., 61, No. 5 (May-June 1982), pp. 811-13. 